ABSTRACT

We use a statistical approach to determine the relationship between the stellar masses of galaxies and the 
masses of the dark matter halos in which they reside. We obtain a parameterized stellar-to-halo mass (SHM) 
relation by populating halos and subhalos in an N-body simulation with galaxies and requiring that the observed 
stellar mass function be reproduced. We find good agreement with constraints from galaxy-galaxy lensing and 
predictions of semi-analytic models. Using this mapping, and the positions of the halos and subhalos obtained 
from the simulation, we find that our model predictions for the galaxy two-point correlation function (CF) as a 
function of stellar mass are in excellent agreement with the observed clustering properties in the SDSS at z = 0. 
We show that the clustering data do not provide additional strong constraints on the SHM function and conclude 
that our model can therefore predict clustering as a function of stellar mass. We compute the conditional mass 
function, which yields the average number of galaxies with stellar masses in the range m ± dm/2 that reside in 
a halo of mass M. We study the redshift dependence of the SHM relation and show that, for low mass halos, the 
SHM ratio is lower at higher redshift. The derived SHM relation is used to predict the stellar mass dependent 
galaxy CF and bias at high redshift. Our model predicts that not only are massive galaxies more biased than 
low mass ones at all redshifts, but the bias increases more rapidly with increasing redshift for massive galaxies 
than for low mass ones. We present convenient fitting functions for the SHM relation as a function of redshift, 
the conditional mass function, and the bias as a function of stellar mass and redshift. 

Subject headings: cosmology: theory — dark matter — galaxies: clusters: general — galaxies: evolution — 
galaxies: halos — galaxies: high-redshift — galaxies: statistics — galaxies: stellar content 
— large-scale structure of universe 
1. INTRODUCTION

In the standard Cold Dark Matter (CDM) paradigm, the 
formation of galaxies is driven by the growth of the large- 
scale structure of the Universe and the formation of dark mat- 
ter halos. Galaxies form by the cooling and condensation of 
gas in the centers of the potential wells of extended virialized
dark matter halos (White & Rees 1978; Fall & Efstathiou 1980;
Blumenthal et al. 1984). In this picture, galaxy properties,
such as luminosity or stellar mass, are expected to be
tightly coupled to the depth of the halo potential and thus to 
the halo mass. 

There are various different approaches to link the properties 
of galaxies to those of their halos. A first method attempts to 
derive the halo properties from the properties of its galaxy
population using e.g. galaxy kinematics (Erickson et al. 1987;
Zaritsky et al. 1993; Carlberg et al. 1996; More et al. 2009a,b),
gravitational lensing (Mandelbaum et al. 2005, 2006;
Cacciato et al. 2008), or X-ray studies (Lin et al. 2003;
Lin & Mohr 2004).

A second approach is to attempt to model the physics that 
shapes galaxy formation ab initio using either large numerical
simulations including both gas and dark matter (Katz et al.
1996; Springel & Hernquist 2003) or semi-analytic models
(SAMs) of galaxy formation (e.g. Kauffmann et al. 1993;
Cole et al. 1994; Somerville & Primack 1999). In "hybrid"
SAMs (e.g. Croton et al. 2006; Bower et al. 2006), dark matter
"merger trees" are extracted from a dark matter only N-body
simulation, and gas processes are treated with semi-analytic
recipes. An advantage of this method is that high-resolution
N-body simulations can track the evolution of individual
subhalos (Klypin et al. 1999; Springel et al. 2001) and
thus provide the precise positions and velocities of galaxies 
within a halo. However, many of the physical processes in- 
volved in galaxy formation (such as star formation and var- 
ious kinds of feedback) are still not well understood, and in 
many cases simulations are not able to reproduce observed 
quantities with high accuracy. 

With the accumulation of data from large galaxy surveys 
over the last decade, a third method has been developed, 
which links galaxies to halos using a statistical approach. 
The Halo Occupation Distribution (HOD) formalism specifies
the probability distribution for a halo of mass M to harbour
N galaxies with certain intrinsic properties, such as luminosity,
color, or type (e.g. Peacock & Smith 2000; Seljak 2000;
White 2001; Berlind & Weinberg 2002). More complex
formulations of this kind of modelling, such as the conditional
luminosity function (CLF) formalism (Yang et al. 2003;
van den Bosch et al. 2003; Yang et al. 2004), have extended
the HOD approach. These methods have the advantage that 
they do not rely on assumptions about the (poorly understood) 
physical processes that drive galaxy formation. In this way, it 
is possible to constrain the relationship between galaxy and 
halo properties (and thus, indirectly, the underlying physics), 
and to construct mock catalogs that reproduce in detail a de- 
sired observational quantity (such as the luminosity function). 
One disadvantage of the classical HOD approach was that one 
had to make assumptions about the distribution of positions 
and velocities of galaxies within their host halos. In addition, 
the results of the HOD modelling can be difficult to interpret 
in terms of the underlying physics of galaxy formation. 

In recent years, HOD models have been introduced that 
make use of information about the positions, velocities and 
masses of halos and subhalos extracted from a dissipation- 
less N-body simulation. The (sub)halo mass is then em- 
pirically linked to galaxy properties by requiring that a sta- 
tistical observational quantity (e.g. galaxy luminosity func- 
tion and/or galaxy two-point-correlation-function) is repro- 
duced. This is either done by assuming parameterized func- 
tions to relate galaxy properties (such as luminosity) to 
halo mass or by assuming a non-parametric monotonic relation.
It has been shown that these simple models reproduce
galaxy clustering as a function of luminosity over a wide
range in redshift (Kravtsov et al. 2004; Tasitsiomi et al. 2004;
Tinker et al. 2005; Vale & Ostriker 2006; Conroy et al. 2006;
Shankar et al. 2006; Wang et al. 2006; Marín et al. 2008).

Observationally, it is well known that galaxy clustering is 
a function of spatial scale, galaxy properties (such as lumi- 
nosity and type), and redshift. Luminous (massive) galaxies 
are more strongly clustered than less luminous (less massive) 
galaxies (Norberg et al. 2001, 2002; Zehavi et al. 2002, 2005;
Li et al. 2006). One can split the galaxy two-point correlation
function (2PCF) into two separate parts: the one-halo and the 
two-halo terms. The one-halo term, which dominates on small 
scales, depends strongly on the galaxy distribution within the 
halo as well as the details of the HOD. The two-halo term, 
which dominates on scales that are much larger than a typical 
halo, is proportional to the auto-correlation of the halo popu- 
lation. In general the two terms are not expected to combine to 
produce a featureless power-law, but generally show a break 
or dip at the scale where the transition from the one-halo to 
the two-halo term occurs (Zehavi et al. 2004).

The extensive multi-wavelength spectrophotometric infor- 
mation that is now available for large numbers of galaxies al- 
lows us to estimate physical parameters of galaxies, such as 
stellar masses, instead of relying on observational properties
such as magnitudes (Bell & de Jong 2001; Kauffmann et al.
2003; Panter et al. 2004). These estimates can even be
obtained, with a proper measure of caution, for high redshift
galaxies. Stellar mass estimates have been presented
in the literature for galaxies up to redshifts as high as z ~ 6
(Yan et al. 2006; Eyles et al. 2007), and stellar mass function
estimates have been presented up to z ~ 5 (Drory et al.
2005; Fontana et al. 2006; Elsner et al. 2008). The goal of
our paper is to develop a "Conditional Stellar Mass Function"
(CMF) formalism, which is the stellar mass analog of the
CLF. The CMF yields the average number of galaxies with
stellar masses in the range m ± dm/2 as a function of the host
halo mass M and can be regarded as the stellar mass function 
(SMF) for halos of mass M. We apply this formalism at low 
redshift and up to the highest redshifts where reliable obser- 
vational stellar mass estimates are available (0.1 < z < 4). In 
this way, we derive a parameterized relationship between dark 
matter halo mass and galaxy mass as a function of redshift. 

Using a parameterized relationship has several advantages. 
First, it provides a convenient way for other researchers to 
make use of our results and obtain an expression for stellar 
mass as a function of halo mass. Second, it is straightfor- 
ward to include scatter in the relation, which is physically 
more realistic: one just has to choose a number drawn from 
an assumed random distribution and add that to the average 
relation. Finally, it is straightforward to treat central and satellite
galaxies separately and assume different relations between
stellar and halo mass for those populations. However, here we 
make the assumption that both populations follow the same 
relation, which has consequences for the clustering predic- 
tions of our model. 

Using the CMF derived only from constraints from the ob- 
served SMF, we compute the predicted (projected) galaxy CF 
at z ~ 0 as a function of stellar mass, and find good agreement
with the observational results of Li et al. (2006). Furthermore,
we show that, assuming central and satellite galaxies
follow the same relation between stellar and halo mass,
adding the clustering constraints does not tighten the con- 
straints on our model parameters; i.e., any model that satisfies 
the mass function constraints will produce the correct clus- 
tering. Based on this result, we use our redshift-dependent 
CMF results to predict the clustering as a function of stellar 
mass and redshift. To date, observational measurements of 
clustering as a function of stellar mass have only been published
for z < 1 (Meneux et al. 2008, 2009). We show that
our model predictions agree very well with these measure- 
ments. Very soon it will be possible to test our predictions for 
redshifts beyond z = 1 with the results from deep wide-field 
surveys (e.g. MUSYC, UKIDSS, etc). We again present convenient
fitting functions for the galaxy bias as a function of
both stellar mass and redshift. In a companion paper we will 
employ our estimates of galaxy bias in order to compute the 
"cosmic variance", the uncertainty in observational estimates 
of the volume density of galaxies arising from the underlying 
large-scale density fluctuations. 

This paper is organized as follows: in section 2 we describe
the N-body simulation, the halo finding algorithm that
was used to obtain a halo catalogue and the treatment of
'orphaned' galaxies. Section 3 specifies our model: we motivate
the form of the stellar-to-halo mass (SHM) relation and constrain
it by requiring that the observed SMF is reproduced.
The clustering properties of galaxies are then inferred from
those of the halo population. We discuss the meaning of the
parameters of the SHM relation and demonstrate that clustering
puts only weak constraints on them. In section 5 we
introduce the CMF, which describes how halos are occupied
by galaxies, and compute the occupation numbers. Section 6
gives a comparison between our results and several other
models and observations. In section 7 we apply our method
to higher redshifts and determine the redshift dependence of
the SHM relation. We make predictions of the stellar mass
dependent galaxy CF at higher redshift which we use to compute
the galaxy bias. Finally, we summarize our methods and
conclusions in section 8.

Throughout this paper we assume a $\Lambda$CDM cosmology with
$(\Omega_m, \Omega_\Lambda, h, \sigma_8, n) = (0.26, 0.74, 0.72, 0.77, 0.95)$. We employ a
Kroupa (2001) initial mass function (IMF) and convert all
stellar masses to this IMF. In order to simplify the notation 
we will use the capital M to denote dark matter halo masses 
and the lower case m to denote galaxy stellar masses. 

2. THE SIMULATION AND HALO CATALOGS 

High-resolution dissipationless N-body simulations have 
shown that distinct halos$^4$ contain subhalos which orbit within 
the potential of their host halo. These subhalos were distinct 
halos in the past, and entered the larger halo via merging dur- 
ing the process of hierarchical assembly. We will refer to the 

4 We refer to virialized halos that are not subhalos of another halo as "dis- 
tinct". 



Stellar-to-Halo Mass Relationship 






galaxy at the center of a distinct halo as a central galaxy, and 
the galaxies within subhalos as "satellites", and we will use 
the term 'halo' to refer to the distinct halo for central galaxies 
and to the subhalo in which the galaxy originally formed for 
satellite galaxies. 

Ab initio models of galaxy formation predict that the stellar 
mass of a galaxy is tightly correlated with the depth of the po- 
tential well of the halo in which it formed. For distinct halos, 
the relevant mass is the virial mass at the time of observa- 
tion. Subhalos, however, lose mass while orbiting in a larger 
system as their outer regions are tidally stripped. Stars are 
centrally concentrated and more tightly bound than the dark 
matter, however, and so the stellar mass of a galaxy which 
is accreted by a larger system probably changes only slightly 
until most of the dark matter has been stripped off. Therefore 
the subhalo mass at the time of observation is probably not a 
good tracer for the potential well that shaped the galaxy prop- 
erties. A better tracer is the subhalo mass at the time that it 
was accreted by the host halo, or its maximum mass over its
history$^5$. This was first proposed by Conroy et al. (2006).

The population of dark matter halos used in this work is 
drawn from an N-body simulation run with the simulation
code GADGET-2 (Springel 2005) on an SGI AltixII at the University
Observatory Munich. The cosmological parameters
of the simulation are chosen to match results from WMAP-3
(Spergel et al. 2007) for a flat $\Lambda$CDM cosmological model:
$\Omega_m = 0.26$, $\Omega_\Lambda = 0.74$, $h = H_0/(100\,{\rm km\,s^{-1}\,Mpc^{-1}}) = 0.72$,
$\sigma_8 = 0.77$ and $n = 0.95$. The initial conditions were generated
using the GRAFIC software package (Bertschinger 2001).
The simulation was done in a periodic box with side length
100 Mpc, and contains $512^3$ particles with a particle mass of
$2.8 \times 10^8 M_\odot$ and a force softening of 3.5 kpc.
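
As a consistency check on the quoted numbers, the particle mass follows from the mean matter density times the box volume, divided by the number of particles. The sketch below assumes the standard value of the critical density, roughly 2.775e11 h^2 M_sun Mpc^-3 (a constant not quoted in the text), and that the 100 Mpc side length is a physical comoving length rather than one in 1/h units; the function and variable names are illustrative only.

```python
# Critical density today in units of h^2 M_sun / Mpc^3 (standard value; an
# assumption here, since the text does not quote it).
RHO_CRIT_H2 = 2.775e11


def particle_mass(omega_m, h, box_mpc, n_per_side):
    """Mass per particle (in M_sun) for a dark-matter-only periodic box."""
    rho_matter = omega_m * RHO_CRIT_H2 * h**2   # mean matter density, M_sun / Mpc^3
    total_mass = rho_matter * box_mpc**3        # mass in the whole box, M_sun
    return total_mass / n_per_side**3


# Parameters as stated in the text: Omega_m = 0.26, h = 0.72, 100 Mpc box, 512^3 particles.
m_p = particle_mass(omega_m=0.26, h=0.72, box_mpc=100.0, n_per_side=512)
print(f"particle mass ~ {m_p:.2e} M_sun")
```

Under these assumptions the result lands close to the quoted particle mass of 2.8 x 10^8 solar masses.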

Dark matter halos are identified in the simulation using a 
friends-of-friends (FoF) halo finder. Substructures inside the 
FoF groups are then identified using the SUBFIND code described
in Springel et al. (2001). For the most massive subgroup
in a FoF group the virial radius and mass are determined
with a spherical overdensity criterion: the density inside a 
sphere centered on the most bound particle is required to be 
greater than or equal to the value predicted by the spherical 
collapse model for a tophat perturbation in a $\Lambda$CDM cosmology
(Bryan & Norman 1998). As discussed above, for subhalos
we use the maximum mass over its past history, which
is typically the mass when the halo was last a distinct halo 
and did not yet overlap with its later host. Merger trees were 
constructed out of the halo catalogs at 94 time-steps, equally 
spaced in expansion factor ($\Delta a = 0.01$), based on the particle 
overlap of halos at different time-steps. 

Due to the finite mass resolution of the simulation
($M_{\rm min\,halo} \simeq 10^{10} M_\odot$), subhalos can no longer be identified
when their mass has dropped below this limit due to tidal
stripping. Since mass loss can be substantial (>90%) this is
important even for fairly massive subhalos. A special treatment
of these so-called "orphans" is necessary. We determine
the orbital parameters at the last moment when a subhalo is
identified in the simulation and use them in the dynamical
friction recipe of Boylan-Kolchin et al. (2008), which is applicable
at radii $r < r_{\rm vir}$. We also tried an alternate recipe in
which we make no explicit use of the subhalo information, but 
apply the dynamical friction formula from the time when the 

5 In an idealized situation, halo mass should increase monotonically with 
time until the halo becomes a subhalo, at which point the mass begins to 
decrease due to tidal stripping. 



satellites first enter the host halo. We obtained very similar 
subhalo mass functions and radial distributions with the alter- 
nate recipe, confirming the self-consistency of the approach. 

For the halo positions in the determination of CFs, we use 
the coordinates of the most bound particle for distinct and sub- 
halos. For orphans, by definition, the position is not known, so 
we follow the position of the most bound particle from the last 
time-step when a subhalo was identified. Since the dynami- 
cal friction force vanishes in the dark matter only simulation 
after a subhalo is dissolved, yet not in reality when a galaxy 
is present at the center of the subhalo, the distance to the cen- 
ter of the host halo might be slightly overestimated with this 
prescription. 

3. CONNECTING GALAXIES AND HALOS 

In this section we describe how we derive the relationship 
connecting the stellar mass of a galaxy to the mass of its dark 
matter halo. In the standard picture of galaxy formation, gas 
can only cool and form stars if it is in a virialized gravitationally
bound dark matter halo (White & Rees 1978). In this
model the gas cooling rate, the star formation rate and thus 
the properties of the galaxy depend mainly on the virial mass 
of the host halo. Thus we expect the stellar mass of a central 
galaxy to be strongly correlated with the virial mass of the 
halo in which the galaxy formed. As we discussed in the last 
section, this corresponds to the virial mass for central galax- 
ies, and to the maximum mass over the halo's history for satel- 
lite galaxies. In the rest of this work, unless noted otherwise, 
the halo mass M will represent: 



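\begin{equation}
M = \begin{cases}
M_{\rm vir} & \text{for host halos} \\
M_{\rm max} & \text{for subhalos}
\end{cases}
\tag{1}
\end{equation}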



Note that we have also experimented with instead using 
the present mass for subhalos, and found that we were not 
able to reproduce the galaxy clustering properties (see also 
Conroy et al. 2006).

3.1. The stellar-to-halo mass relation 

In order to link the stellar mass of a galaxy m to the mass 
of its dark matter halo M we need to specify the SHM ra- 
tio. A direct comparison of the halo mass function n(M) and 
the galaxy mass function $\Phi(m)$ helps to constrain the
stellar-to-halo mass function. If we assume that every host
(sub)halo contains exactly one central (satellite) galaxy and that
each system has exactly the same SHM ratio m/M, the galaxy 
stellar mass function can be derived trivially from the halo 
mass function and has the same features. The galaxy mass 
function derived for m/M = 0.05 is compared to the observed 
SDSS galaxy mass function in Figure 1. The observed galaxy 
mass function is steeper for high masses and shallower for 
low masses than the one derived from the halo mass function. 
Thus, for a constant SHM ratio there will inevitably be too 
many galaxies at the low and high mass end. 
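
The statement that a constant ratio preserves the features of the halo mass function can be written out in one step. The sketch below assumes that both mass functions are expressed per logarithmic mass interval (the text does not specify the binning), and uses $f$ as shorthand for the constant ratio $m/M$ ($f = 0.05$ in the comparison above).

```latex
m = fM \;\Rightarrow\; \mathrm{d}\log m = \mathrm{d}\log M ,
\qquad
\Phi(m)\,\mathrm{d}\log m = n(M)\,\mathrm{d}\log M
\;\Rightarrow\;
\Phi(m) = n\!\left(\frac{m}{f}\right) .
```

Under this assumption the model galaxy mass function is simply the halo mass function shifted by $\log f$ along the mass axis, so any mismatch in slope with the observed function cannot be removed by changing $f$ alone.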

This implies that the actual SHM ratio m/M is not con- 
stant, but increases with increasing mass, reaches a maximum 
around m* and then decreases again. Hence we adopt the following
parametrization, similar to the one used in Yang et al.
(2003):



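\begin{equation}
\frac{m(M)}{M} = 2 \left(\frac{m}{M}\right)_0 \left[ \left(\frac{M}{M_1}\right)^{-\beta} + \left(\frac{M}{M_1}\right)^{\gamma} \right]^{-1}
\tag{2}
\end{equation}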



It has four free parameters: the normalization of the stellar-to-halo
mass ratio $(m/M)_0$, a characteristic mass $M_1$, where the
SHM ratio is equal to $(m/M)_0$, and two slopes $\beta$ and $\gamma$ which
indicate the behavior of m/M at the low and high mass ends
respectively. We use the same parameters for the central and
satellite populations, since, unlike luminosity, the stellar
mass of satellites changes only slightly after they are accreted
by the host halo.

FIG. 1. A comparison between the halo mass function offset by a factor of
0.05 (dashed line), the observed galaxy mass function (symbols), our model
without scatter (solid line) and our model including scatter (dotted line). We
see that the halo and the galaxy mass functions are different shapes, implying
that the stellar-to-halo mass ratio m/M is not constant. Our four parameter
model for the halo mass dependent stellar-to-halo mass ratio is in very good
agreement with the observations (both including and neglecting scatter).

Note that though both $\beta$ and $\gamma$ are expected to be positive, 
they are not restricted to be so. The SHM relation is therefore 
not necessarily monotonic. 
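
The parametrization of equation (2) is simple to evaluate numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the parameter values are the best-fit values listed in Table 1 (no scatter).

```python
def shm_ratio(M, M1, ratio0, beta, gamma):
    """Stellar-to-halo mass ratio m/M of equation (2); M and M1 in M_sun."""
    x = M / M1
    return 2.0 * ratio0 / (x ** (-beta) + x ** gamma)


def stellar_mass(M, M1, ratio0, beta, gamma):
    """Stellar mass m(M) in M_sun implied by equation (2)."""
    return M * shm_ratio(M, M1, ratio0, beta, gamma)


# Best-fit values quoted in Table 1: log M1 = 11.884, (m/M)_0 = 0.02820,
# beta = 1.057, gamma = 0.556.
BEST_FIT = dict(M1=10 ** 11.884, ratio0=0.02820, beta=1.057, gamma=0.556)

for log_M in (10.0, 11.0, 12.0, 13.0, 14.0, 15.0):
    M = 10.0 ** log_M
    print(f"log M = {log_M:4.1f}   m/M = {shm_ratio(M, **BEST_FIT):.5f}")
```

At $M = M_1$ the two terms in the bracket are both unity, so the ratio reduces to $(m/M)_0$, which is a quick way to check an implementation.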

3.2. Constraining the free parameters 

Having set up the model we now need to constrain the four 
free parameters $M_1$, $(m/M)_0$, $\beta$ and $\gamma$. To do this, we populate
the halos in the simulation with galaxies. The stellar
masses of the galaxies depend on the mass of the halo and are
derived according to our prescription (equation 2). The positions
of the galaxies are given by the halo positions in the
N-body simulation.

Once the simulation box is filled with galaxies, it is straightforward
to compute the SMF $\Phi_{\rm mod}(m)$. As we want to fit this
model mass function to the observed mass function $\Phi_{\rm obs}(m)$
by Panter et al. (2007), we choose the same stellar mass range
($10^{8.5} - 10^{11.85} M_\odot$) and the same binsize. The observed SMF
was derived using spectra from the Sloan Digital Sky Survey
Data Release 3 (SDSS DR3); see Panter et al. (2004) for a
description of the method.
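
The population step and the model SMF can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the number of bins (`n_bins`) is hypothetical because the bin size of the observed SMF is not quoted here, the SMF is expressed per dex, and the toy halo list stands in for a real halo catalogue.

```python
import math


def shm_stellar_mass(M, M1, ratio0, beta, gamma):
    """Stellar mass from equation (2); all masses in M_sun."""
    x = M / M1
    return M * 2.0 * ratio0 / (x ** (-beta) + x ** gamma)


def stellar_mass_function(halo_masses, box_mpc, params,
                          log_min=8.5, log_max=11.85, n_bins=10):
    """Return (bin centres in log10 m, Phi in Mpc^-3 dex^-1) for one box."""
    width = (log_max - log_min) / n_bins
    counts = [0] * n_bins
    for M in halo_masses:
        log_m = math.log10(shm_stellar_mass(M, **params))
        k = int((log_m - log_min) // width)   # index of the bin containing log_m
        if 0 <= k < n_bins:                   # galaxies outside the range are dropped
            counts[k] += 1
    volume = box_mpc ** 3
    centres = [log_min + (k + 0.5) * width for k in range(n_bins)]
    phi = [c / (volume * width) for c in counts]
    return centres, phi


# Toy input: five identical (sub)halos of 1e12 M_sun in a 100 Mpc box,
# with the Table 1 best-fit parameters.
params = dict(M1=10 ** 11.884, ratio0=0.02820, beta=1.057, gamma=0.556)
centres, phi = stellar_mass_function([1e12] * 5, 100.0, params)
```

In a real application `halo_masses` would hold $M_{\rm vir}$ for distinct halos and the maximum past mass for subhalos, as defined in equation (1).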

Furthermore it is possible to determine the stellar mass dependent
clustering of galaxies. For this we compute projected
galaxy CFs $w_{p,{\rm mod}}(r_p, m_i)$ in several stellar mass bins which
we choose to be the same as in the observed projected galaxy
CFs of Li et al. (2006). These were derived using a sample
of galaxies from the SDSS DR2 with stellar masses estimated
from spectra by Kauffmann et al. (2003).

We first calculate the real space CF $\xi(r)$. In a simulation
this can be done by simply counting pairs in distance bins:



\begin{equation}
\xi(r_i) = \frac{dd(r_i)}{N_p(r_i)} - 1
\tag{3}
\end{equation}



where $dd(r_i)$ is the number of pairs counted in a distance bin
and $N_p(r_i) = 2\pi N^2 r_i^2 \Delta r_i / L_{\rm box}^3$, where N is the total number of
galaxies in the box. The projected CF $w_p(r_p)$ can be derived
by integrating the real space correlation function $\xi(r)$ along
the line of sight:



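\begin{equation}
w_p(r_p) = 2 \int_0^\infty {\rm d}r_\parallel \, \xi\!\left(\sqrt{r_\parallel^2 + r_p^2}\right) = 2 \int_{r_p}^\infty {\rm d}r \, \frac{r \, \xi(r)}{\sqrt{r^2 - r_p^2}}
\tag{4}
\end{equation}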



where the comoving distance (r) has been decomposed into
components parallel ($r_\parallel$) and perpendicular ($r_p$) to the line
of sight. The integration is truncated at 45 Mpc. Due to
the finite size of the simulation box ($L_{\rm box} = 100$ Mpc) the
model correlation function is not reliable beyond scales of
$r \sim 0.1 L_{\rm box} \sim 10$ Mpc.
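
The pair-counting estimator of equation (3) and the truncated line-of-sight integral of equation (4) can be sketched in a few lines. This is a brute-force illustration, not the code used for the paper: it assumes positions lie in [0, L_box), uses the bin midpoint for $r_i$, and integrates with a simple trapezoidal rule; all function names are hypothetical.

```python
import math
from itertools import combinations


def pair_counts(positions, edges, box):
    """Count unique pairs per separation bin using periodic (minimum-image) distances."""
    counts = [0] * (len(edges) - 1)
    for p, q in combinations(positions, 2):
        d2 = 0.0
        for a, b in zip(p, q):
            d = abs(a - b)
            d = min(d, box - d)               # wrap across the periodic boundary
            d2 += d * d
        r = math.sqrt(d2)
        for k in range(len(counts)):
            if edges[k] <= r < edges[k + 1]:
                counts[k] += 1
                break
    return counts


def xi_from_counts(dd, n_gal, edges, box):
    """Equation (3): xi(r_i) = dd(r_i) / N_p(r_i) - 1."""
    xi = []
    for k, n_pairs in enumerate(dd):
        r_mid = 0.5 * (edges[k] + edges[k + 1])
        dr = edges[k + 1] - edges[k]
        n_p = 2.0 * math.pi * n_gal ** 2 * r_mid ** 2 * dr / box ** 3
        xi.append(n_pairs / n_p - 1.0)
    return xi


def projected_cf(xi_func, r_p, pi_max=45.0, n_steps=4500):
    """Equation (4), truncated at pi_max (in Mpc), by the trapezoidal rule."""
    h = pi_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        r = math.hypot(k * h, r_p)            # r = sqrt(r_par^2 + r_p^2)
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * xi_func(r)
    return 2.0 * h * total
```

For a full catalogue the double loop over pairs becomes expensive, and a tree or grid based pair counter would normally replace it; the normalization and projection steps stay the same.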

In order to fit the model to the observations we use Powell's
direction set method in multidimensions (e.g. Press et al.
1992) to find the values of $M_1$, $(m/M)_0$, $\beta$ and $\gamma$ that minimize
either

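\begin{equation}
\chi_r^2 = \chi_r^2(\Phi) = \frac{\chi^2(\Phi)}{N_\Phi}
\end{equation}

(mass function fit) or

\begin{equation}
\chi_r^2 = \chi_r^2(\Phi) + \chi_r^2(w_p) = \frac{\chi^2(\Phi)}{N_\Phi} + \frac{\chi^2(w_p)}{N_r N_m}
\end{equation}

(mass function and projected CF fit) with $N_\Phi$ and $N_r$ the number
of data points for the SMF and projected CFs, respectively,
and $N_m$ the number of mass bins for the projected CFs.
In this context $\chi^2(\Phi)$ and $\chi^2(w_p)$ are defined as: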



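\begin{equation}
\chi^2(\Phi) = \sum_{i=1}^{N_\Phi} \left[ \frac{\Phi_{\rm mod}(m_i) - \Phi_{\rm obs}(m_i)}{\sigma_{\Phi_{\rm obs}}(m_i)} \right]^2
\end{equation}

\begin{equation}
\chi^2(w_p) = \sum_{i=1}^{N_m} \sum_{j=1}^{N_r} \left[ \frac{w_{p,{\rm mod}}(r_{p,j}, m_i) - w_{p,{\rm obs}}(r_{p,j}, m_i)}{\sigma_{w_{p,{\rm obs}}}(r_{p,j}, m_i)} \right]^2
\end{equation}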



with $\sigma_{\Phi_{\rm obs}}$ and $\sigma_{w_{p,{\rm obs}}}$ the errors for the SMF and projected CFs,
respectively. Note that for the simultaneous fit, by adding the
reduced $\chi^2$, we give the same weight to both data sets.
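
The two figures of merit can be written compactly as below. This is a sketch under the assumption that the projected CFs are stored as one list per stellar mass bin with the same number of $r_p$ points in each; the function names are hypothetical. Any Powell-type minimizer could then be applied to the returned value, for example SciPy's `scipy.optimize.minimize` with `method="Powell"`, although the text does not state which implementation was used.

```python
def chi2(model, observed, sigma):
    """Sum of squared, error-weighted residuals."""
    return sum(((m - o) / s) ** 2 for m, o, s in zip(model, observed, sigma))


def reduced_chi2_smf(phi_mod, phi_obs, sig_phi):
    """chi^2(Phi) / N_Phi for the mass-function-only fit."""
    return chi2(phi_mod, phi_obs, sig_phi) / len(phi_obs)


def reduced_chi2_combined(phi_mod, phi_obs, sig_phi, wp_mod, wp_obs, sig_wp):
    """chi^2(Phi)/N_Phi + chi^2(w_p)/(N_r N_m) for the simultaneous fit.

    wp_mod, wp_obs and sig_wp are lists with one entry per stellar mass bin,
    each entry being a list with one value per r_p bin.
    """
    n_m = len(wp_obs)          # number of stellar mass bins
    n_r = len(wp_obs[0])       # number of r_p points per mass bin
    chi2_wp = sum(chi2(wm, wo, sw) for wm, wo, sw in zip(wp_mod, wp_obs, sig_wp))
    return reduced_chi2_smf(phi_mod, phi_obs, sig_phi) + chi2_wp / (n_r * n_m)
```

Dividing each term by its own number of data points is what gives the two data sets equal weight, regardless of how many points each contains.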

3.3. Estimation of parameter errors 

In order to obtain estimates of the errors on the parameters,
we need their probability distribution prob(A|I), where A
is the parameter under consideration and I is the given background
information. The most likely value of A, $A_{\rm best}$, is then
the value that maximizes prob(A|I).

As we have to assume that all our parameters are coupled, 
we can only compute the probability for a given set of param- 
eters. This probability is given by: 

\begin{equation}
{\rm prob}(M_1, (m/M)_0, \beta, \gamma \,|\, I) \propto \exp(-\chi^2)
\end{equation}

In a system with four free parameters A,B,C and D one can 
calculate the probability distribution of one parameter (e.g. 
A) if the probability distribution for the set of parameters is 
known, using marginalization: 

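\begin{equation}
{\rm prob}(A|I) = \int_{-\infty}^{\infty} {\rm prob}(A,B|I)\,{\rm d}B = \iiint_{-\infty}^{\infty} {\rm prob}(A,B,C,D|I)\,{\rm d}B\,{\rm d}C\,{\rm d}D
\end{equation}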
FIG. 2. Comparison between the model (lines) and observed (symbols with errorbars) projected correlation functions. We show the model results both 
including (solid) and excluding (dashed) orphan galaxies. The models have been derived by fitting to the stellar mass function only. 



Once the probability distribution for a parameter is deter- 
mined, one can assign errors based on the confidence inter- 
vals. This is the shortest interval that encloses a certain per- 
centage X of the area under the posterior probability distribu- 
tion. For the 1-sigma error X = 68% while for the 2-sigma 
error X = 95%. Assuming that the probability distribution has 
been normalized to have unit area we seek $A_1$ and $A_2$ such that

\begin{equation}
\int_{-\infty}^{A_1} {\rm prob}(A|I)\,{\rm d}A = \int_{A_2}^{\infty} {\rm prob}(A|I)\,{\rm d}A = \frac{1-X}{2} .
\end{equation}

Finally the parameter A is given as $A = A_{\rm best}{}^{+\sigma_+}_{-\sigma_-}$ with $\sigma_+ =
A_2 - A_{\rm best}$ and $\sigma_- = A_{\rm best} - A_1$. The errors derived in this way
only include sources that have been considered when computing
$\chi^2$. The calculation of the errors applies for uncorrelated
data points. Since in our case the data points are correlated, the
values of the errors are slightly modified. Also, errors caused
by cosmic variance are not included.
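
The marginalization and the tail-area condition can be illustrated on a parameter grid. The sketch below is an assumption-laden toy, not the authors' mesh calculation: it marginalizes over a single nuisance parameter on a uniform grid, takes prob proportional to exp(-chi^2) as stated above, and finds $A_1$ and $A_2$ from the equal-tail condition by linear interpolation of the cumulative distribution. All names are hypothetical.

```python
import math


def marginalize(chi2_grid, dB):
    """Unnormalized prob(A_i|I) from chi2_grid[i][j] = chi^2 at (A_i, B_j)."""
    return [sum(math.exp(-c) for c in row) * dB for row in chi2_grid]


def equal_tail_interval(a_values, prob, X=0.68):
    """Return (A_best, sigma_plus, sigma_minus) with (1 - X)/2 of the area in each tail."""
    # Cumulative distribution by the trapezoidal rule, then normalized to unit area.
    cdf = [0.0]
    for k in range(1, len(a_values)):
        step = a_values[k] - a_values[k - 1]
        cdf.append(cdf[-1] + 0.5 * (prob[k] + prob[k - 1]) * step)
    total = cdf[-1]
    cdf = [c / total for c in cdf]

    def quantile(q):
        # Linear interpolation of the inverse cumulative distribution.
        for k in range(1, len(cdf)):
            if cdf[k] >= q:
                t = (q - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
                return a_values[k - 1] + t * (a_values[k] - a_values[k - 1])
        return a_values[-1]

    tail = 0.5 * (1.0 - X)
    a1, a2 = quantile(tail), quantile(1.0 - tail)
    a_best = a_values[max(range(len(prob)), key=prob.__getitem__)]
    return a_best, a2 - a_best, a_best - a1


# Toy example: a parabolic chi^2 in A centred on 1.0.
a_grid = [k * 0.001 for k in range(2001)]
p_grid = [math.exp(-((a - 1.0) ** 2) / (2.0 * 0.1 ** 2)) for a in a_grid]
a_best, sig_plus, sig_minus = equal_tail_interval(a_grid, p_grid)
```

For an asymmetric posterior the two returned errors differ, which is how unequal upper and lower errors such as those in Table 1 arise.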

4. FITTING RESULTS 

Here we present the results we obtain by fitting to the stellar 
mass function only, and for the combined fit to the SMF and 
the projected CF. 

4.1. The stellar mass function fit

First we fit to the SDSS SMF and use the derived best-fit 
parameters to calculate the model projected correlation func- 
tions. Note that for now, we do not take into account any 
possible scatter in the m(M) relation. We will consider scatter
later in this section.



TABLE 1

Fitting results for Stellar-to-Halo Mass relationship

              log M_1    (m/M)_0    β        γ        χ²(Φ)    χ²(w_p)
best fit      11.884     0.02820    1.057    0.556    1.56     3.83
σ+             0.030     0.00061    0.054    0.010
σ-             0.023     0.00053    0.046    0.004

NOTE. No scatter included. All masses are in units of $M_\odot$.



We see in Figure 1 that our fit produces excellent agreement
with the observed SMF. Using the approach described above
we also compute the errors on the parameters. The results are
summarized in Table 1.

Having derived the best-fit parameters, we can predict the 
projected CFs. We present the results both including and not 
including orphan galaxies, where we have fitted to the SMF 
for each case. 

Figure 2 shows a comparison between our model and the
SDSS projected correlation functions in five stellar mass bins
ranging from $\log m/M_\odot = 9.0$ to $\log m/M_\odot = 11.5$ with a binsize
of 0.5 dex. The correlation function that has been derived
without orphans is too low at small scales and can be regarded
as a lower limit. Neglecting these galaxies results in an
underprediction of satellite galaxy clustering. As on small scales
the projected CF depends mainly on the one-halo term this results
in the underprediction of $w_p(r_p)$. This effect weakens for
the clustering of more massive galaxies as they are more likely
to be central galaxies and thus not affected by tidal stripping
at all.

FIG. 3. Sketch of the probability distributions for a simultaneous fit. The
solid line corresponds to $\chi^2(\Phi)$ and the dotted line to $\chi^2(w_p)$. The dashed
line is the sum of both. Since $\chi^2(w_p)$ is flat at the minimum, the sum follows
$\chi^2(\Phi)$ with an offset. The resulting probability distribution does not change
(after normalization).

The agreement with the observationally derived $w_p(r_p)$ for
the catalogue including orphaned galaxies is very good, which
is also reflected in the low value of $\chi^2(w_p) = 3.83$. Note that
this value has been calculated with the parameters from the
mass function fit given above and does not correspond to a fit
to the projected CFs.

Note that we plot the projected CFs only up to 20 Mpc.
Because of the finite box size, the clustering of host halos
and thus central galaxies is underpredicted at large scales
independent of mass. Additionally, due to the lack of
long-wavelength modes, massive halos and galaxies can be
underproduced, leading to an underprediction of $w_p$ for the
massive objects, independent of scale. However, the latter effect
is very small, since the abundance of the massive halos in
our simulation agrees very well with the predicted average
(Sheth & Tormen 1999).

As a test we also used the present mass instead of the maxi- 
mum mass for subhalos. We then found that the projected CF 
was underpredicted particularly on small scales. This effect is 
due to tidal stripping of subhalos and is thus strongest at small 
scales where the subhalo contribution dominates. 

4.2. The combined fit 

We now investigate whether we can improve the agreement 
between the model and the observed projected CFs by per- 
forming a combined fit as described above. We obtain the 
same parameters as those we derived from the fit to the SMF 
alone. This seems surprising, but on further inspection we 
find that this is due to $\chi^2(\Phi)$ being a lot more sensitive to
changes of the parameters than $\chi^2(w_p)$. This means that if one
changes the parameters a little in order to improve the fit to the
projected correlation functions, one can get a slightly better
agreement between the model and the observed projected CFs
only at the cost of a large disagreement between the model and
the observed stellar mass functions. In other words: $\chi^2(w_p)$
is much flatter around its minimum than $\chi^2(\Phi)$, as shown in
Figure 3.
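
The flatness argument can be made explicit in one line. As an idealization, assume that $\chi^2(w_p)$ takes a constant value $c$ over the region of parameter space where $\exp[-\chi^2(\Phi)]$ is non-negligible; $\Theta$ is shorthand introduced here for the four model parameters.

```latex
\mathrm{prob}(\Theta\,|\,I) \;\propto\; \exp\!\left[-\chi^2(\Phi) - c\right]
\;=\; e^{-c}\,\exp\!\left[-\chi^2(\Phi)\right]
```

The constant factor $e^{-c}$ cancels when the distribution is normalized, so under this idealization the best-fit parameters and their errors are the same as for the fit to the SMF alone.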




FIG. 4. The derived relation between stellar mass and halo mass. The
light shaded area shows the 1σ region while the dark and light shaded areas
together show the 2σ region. The upper panel shows the SHM relation while
the lower panel shows the SHM ratio.



This means that, assuming that both central and satel- 
lite galaxies follow the same SHM relation, the model that 
matches the SMF can reproduce the correct clustering. How- 
ever, if subhalos have a different SHM ratio there is an infi- 
nite number of solutions that match the SMF but produce very 
different correlation functions. The only way to constrain the 
SHM relations then is to take the clustering data into account. 
By adopting different SHM relations for central and satellite 
populations it is even possible to produce a slightly better fit
to the correlation functions (Wang et al. 2006).

On the other hand, if one wants to predict clustering as a
function of stellar mass (e.g. at higher redshift) then one has
to make an assumption about how the SHM ratios of central
and satellite galaxies are related. We made the very simple
assumption that the relation between the stellar mass of central
galaxies and the virial mass of their host halo is the same as
the relation between the stellar mass of satellite galaxies and the
mass of the subhalo at the time of accretion, and
have shown that this leads to very good predictions for the
mass dependent clustering. We conclude that under this simple
assumption we can use our model to predict clustering as
a function of stellar mass.

4.3. The resulting stellar-to-halo mass relation 

The upper panel of Figure 4 shows the derived stellar mass
as a function of halo mass. The light shaded area gives the
68% confidence interval while the dark and light shaded areas
together give the 95% confidence interval. These have been
derived using a set of different models computed on a mesh,
as described in Section 3.3.

FIG. 5. Correlations between the model parameters. The panels show
contours of constant χ² (i.e. constant probability) for the fit including
constraints from the SMF only. The parameter pairs are indicated in each panel.



For the SHM ratio we apply the same procedure. The result
is shown in the lower panel of Figure 4. We see that the SHM
ratio has the form we expected: it increases with increasing
halo mass, reaches its maximum value around M_1 and then
decreases again.

4.4. Meaning of parameters and correlations 

We now explore the effects of changing each parameter in
order to understand how they affect the SMF. If we keep M_1, β
and γ fixed and only vary (m/M)_0, this corresponds to changing
the stellar mass of the galaxy that lives inside each halo
by a constant factor. This has no impact on the form of the
SMF: its shape stays the same, while only its position on the
stellar mass axis changes. Due to the monotonic form of the
SMF this directly determines the value of the normalization
φ*. For a larger value of (m/M)_0 we get a larger value of φ*.

Varying only M_1 we find that the shape of the SMF changes
drastically. For a higher M_1 than our best-fit value, we get too
many massive galaxies and too few low mass galaxies, while
for a lower value of M_1 we get too few massive galaxies and
too many low mass galaxies. This is because M_1 is the characteristic
mass corresponding to the highest SHM ratio. In the
SMF, this corresponds to the knee, and we get an SMF which
has its knee at the stellar mass corresponding to M_1. For a
larger M_1 the knee is shifted to a higher stellar mass. Together,
M_1 and the maximum stellar-to-halo mass ratio (m/M)_0 determine
the normalization of the stellar mass function φ* and the
characteristic mass m*.

FIG. 6. Stellar mass as a function of halo mass with σ_m = 0.15 dex. The
solid line corresponds to our model without scatter while the points represent
the model with scatter (note that only 20% of the total number of objects are
plotted). The relation between halo mass and the average stellar mass for the
model with scatter is shown by the dashed line.

Changing β affects mainly the low mass slope of the stellar
mass function. For larger values of β the slope becomes
shallower. As β influences mainly the slope of the low mass
end of the SMF, it is strongly related to the parameter α of the
Schechter function. A small value of β corresponds to a high
value of α.

If we change γ, this mainly impacts the slope of the massive
end of the SMF. For values of γ larger than its best-fit
value the slope of the massive end becomes steeper. As γ
affects mainly the slope of the massive end of the SMF it is
not coupled to a parameter of the Schechter function, though it
is related to the high-mass cutoff, assumed to be exponential
in a Schechter function.
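
The roles of the four parameters can be checked numerically. The following is a minimal sketch, assuming the SHM relation has the double power-law form m(M) = 2M (m/M)_0 [(M/M_1)^(-β) + (M/M_1)^γ]^(-1) (the same form that appears in Equation 11) and using the "Our model" values quoted in Table 4. The function names are illustrative and not taken from the paper.

```python
import math

# "Our model" values from Table 4 (masses in solar masses).
LOG_M1 = 11.884
RATIO0 = 0.0282   # (m/M)_0
BETA = 1.06
GAMMA = 0.556


def shm_ratio(M, log_m1=LOG_M1, ratio0=RATIO0, beta=BETA, gamma=GAMMA):
    """Stellar-to-halo mass ratio m/M for the assumed double power law."""
    x = M / 10.0 ** log_m1
    return 2.0 * ratio0 / (x ** (-beta) + x ** gamma)


def stellar_mass(M, **kw):
    """Stellar mass m(M) = M * (m/M)."""
    return M * shm_ratio(M, **kw)


def log_slope(f, M, eps=1e-4):
    """Numerical d log f / d log M by central differences."""
    up, dn = M * 10.0 ** eps, M * 10.0 ** (-eps)
    return (math.log10(f(up)) - math.log10(f(dn))) / (2.0 * eps)


if __name__ == "__main__":
    for log_m in (10.5, 11.0, 11.884, 12.5, 14.0, 15.0):
        M = 10.0 ** log_m
        print(f"log M = {log_m:6.3f}  m/M = {shm_ratio(M):.5f}  "
              f"dlog(m/M)/dlogM = {log_slope(shm_ratio, M):+.3f}")
```

In this form the ratio equals (m/M)_0 exactly at M = M_1, its logarithmic slope tends to +β far below M_1 and to -γ far above it, and rescaling (m/M)_0 rescales every stellar mass by the same factor, which is the behavior described in the text.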

Figure 5 shows the contours of the two-dimensional probability
distributions for the parameter pairs. We see a correlation
between the parameters [M_1, γ] and [(m/M)_0, γ] and
an anti-correlation between [β, M_1] and [(m/M)_0, M_1].

There does not seem to be a correlation for the remaining pairs, [(m/M)_0, β] and [β, γ].

4.5. Introducing scatter 

Up until now we have assumed that there is a one-to-one, 
deterministic relationship between halo mass and stellar mass. 
However, in nature, we expect that two halos of the same mass 
M may harbor galaxies with different stellar masses, since 
they can have different halo concentrations, spin parameters 
and merger histories. 

For each halo of mass M, we now assign a stellar mass
m drawn from a log-normal distribution with a mean value
given by our previous expression for m(M) (Equation 2),
with a variance of σ_m². We assume that the variance is
constant for all halo masses, which means that the percent deviation
from m is the same for every galaxy. This is consistent
with other halo occupation models, semi-analytic models and
satellite kinematics (Cooray 2006; van den Bosch et al. 2007;
More et al. 2009b).
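
The assignment step can be sketched as follows. This is an illustration only: it assumes the double power-law form of the SHM relation (as in Equation 11), uses the best-fit values listed in Table 2, and interprets the log-normal as a Gaussian in log10(m) centred on log10 m(M) with width σ_m in dex. All function names are hypothetical.

```python
import math
import random

SIGMA_M = 0.15  # scatter in log10(m), in dex (value used in the text)


def median_stellar_mass(M, log_m1=11.899, ratio0=0.02817, beta=1.068, gamma=0.611):
    """Assumed double power-law SHM relation with the Table 2 best-fit values."""
    x = M / 10.0 ** log_m1
    return 2.0 * M * ratio0 / (x ** (-beta) + x ** gamma)


def draw_stellar_mass(M, rng, sigma_m=SIGMA_M):
    """Draw m from a log-normal centred on log10 m(M) with width sigma_m dex."""
    log_m = math.log10(median_stellar_mass(M)) + rng.gauss(0.0, sigma_m)
    return 10.0 ** log_m


def mean_to_median_ratio(sigma_m=SIGMA_M):
    """<m> / median(m) for a log-normal with scatter sigma_m in dex."""
    return math.exp(0.5 * (sigma_m * math.log(10.0)) ** 2)


if __name__ == "__main__":
    rng = random.Random(42)
    M = 1.0e12
    logs = [math.log10(draw_stellar_mass(M, rng)) for _ in range(20000)]
    mu = sum(logs) / len(logs)
    sd = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    print(f"log10 m(M) = {math.log10(median_stellar_mass(M)):.3f}")
    print(f"sample mean = {mu:.3f}, sample scatter = {sd:.3f} dex")
    print(f"<m>/median = {mean_to_median_ratio():.4f}")
```

Under this interpretation the linear average stellar mass at fixed halo mass lies about 6% above the most likely value for σ_m = 0.15 dex, because the log-normal is skewed toward high masses.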

Assuming a value of σ_m = 0.15 dex and fitting the stellar
mass function only, we find the values given in Table 2. These
values lie within the (2σ) error bars of the best-fit values that
we obtained with no scatter. The largest change is in the value
of γ, which controls the slope of the SHM relation at large
halo masses. The SMF and the projected CFs for the model
including scatter are shown in Figures 1 and 2 respectively,
and show very good agreement with the observed data.

TABLE 2
Fitting Results for the Stellar-to-Halo Mass Relationship

            log M_1   (m/M)_0   β       γ       χ_r²(Φ)   χ_r²(w_p)
best fit    11.899    0.02817   1.068   0.611   1.42      4.21
σ+           0.026    0.00063   0.051   0.012
σ-           0.024    0.00057   0.044   0.010

NOTE. Including scatter σ_m = 0.15. All masses are in units of M_⊙.

In Figure 6 we compare our model without scatter with the
model including scatter. We have also included the relation
between halo mass and the average stellar mass. Especially
at the massive end scatter can influence the slope of the SMF,
since there are few massive galaxies. This has an impact on
γ, and as all parameters are correlated, scatter also affects the
other parameters. We thus see a difference between the model
without scatter and the most likely stellar mass in the model
with scatter in Figure 6.

5. THE CONDITIONAL MASS FUNCTION 

In the previous section we derived a model that specifies 
the stellar mass of a central galaxy as a function of the virial 
mass of its host halo and the stellar mass of a satellite galaxy 
as a function of the maximum mass of the subhalo in which 
it lives. It has become common to represent the population 
of host halos by the Halo Occupation Distribution (HOD). 
This includes the halo occupation function P(N|M), which is
the probability distribution that a halo of mass M contains N
galaxies (of a specific type). A close relative of the HOD is
the "conditional luminosity function" (CLF; e.g. Yang et al.
2003; van den Bosch et al. 2007; Yang et al. 2004). It extends
the halo occupation function P(N|M) (which gives only information
about the total number of galaxies per halo in a given
luminosity range) and yields the average number of galaxies
with luminosities in the range L ± dL/2 as a function of the
virial mass M of their host halo.

We define its analog, the "conditional mass function"
(CMF), as the average number of galaxies with stellar masses
in the range m ± dm/2 as a function of the virial mass M of
their host halo. This provides a direct link between the SMF
Φ(m) and the host halo mass function dn(M)/dM:

$$\Phi(m) = \int_0^\infty \Phi(m|M)\,\frac{dn(M)}{dM}\,dM \qquad (5)$$

A host halo of mass M can contain a whole population of
galaxies with different stellar masses m. If we count the number
of galaxies living in host halos with a virial mass in the
range M ∈ [M_1, M_2] we can compute the SMF of the halo bin
[M_1, M_2]:

$$\tilde{\Phi}(m) = \int_{M_1}^{M_2} \Phi(m|M)\,\frac{dn(M)}{dM}\,dM \approx \Phi(m|M_m)\,\Delta n \qquad (6)$$

The tilde over a function represents the fact that it is computed
in a halo mass bin. We have replaced the integral by a "tophat"
with a width of Δn (number of host halos in the bin) and a
height of Φ(m|M_m), where M_m is the geometric mean of the
minimum and maximum halo masses bracketing the bin.



This equation allows us to put constraints on Φ(m|M)
by calculating Φ̃(m)/Δn. We can then choose an adequate
parameterization of Φ(m|M) and fit these parameters to
Φ̃(m)/Δn in every halo mass bin. Finally we can investigate
the halo mass dependence of the parameters.

5.1. Parameterization 

In order to specify the CMF Φ(m|M) we divide the galaxy
population into a central and a satellite part, as in the updated
CLF formalism (Zheng et al. 2005; Zehavi et al. 2005;
Cooray 2006; Yang et al. 2008; Cacciato et al. 2008). The
central part is Φ_c(m|M) and the satellite part is Φ_s(m|M). Then
the total CMF is the sum of both parts:

$$\Phi(m|M) = \Phi_c(m|M) + \Phi_s(m|M) \qquad (7)$$

Note that both Φ_c(m|M) and Φ_s(m|M) are statistical functions
and should not be regarded as the mass functions of
galaxies living in a given individual halo.

For the central population we expect the CMF to have a
peak around the stellar mass m_c that corresponds to the host
halo's virial mass M in the SHM relation (Equation 2). Due to
the halo mass bin size this distribution gets smeared out, because
halos in the interval [M_1, M_2] contain central galaxies of
stellar masses m ∈ [m_1(M_1), m_2(M_2)]. Thus Φ̃(m)/Δn will be
finite inside the interval [m_1(M_1), m_2(M_2)] and zero elsewhere,
with a normalization such that the number of central galaxies
per halo equals one. This can be regarded as scatter σ_bin
due to the binning. If we add intrinsic scatter σ_m to relation
(2), we expect Φ_c(m|M) to be a lognormal with a maximum
around m_c(M) and a variance of σ_m². To this scatter the binning
scatter σ_bin adds in quadrature (assuming that σ_bin and σ_m are
uncorrelated), resulting in a total scatter of σ_c² = σ_m² + σ_bin². For
both cases (σ_m = 0 and σ_m ≠ 0) we use a lognormal distribution:

$$\Phi_c(m|M) = \frac{1}{\sqrt{2\pi}\,\ln 10\; m\,\sigma_c}\exp\left[-\frac{\log^2(m/m_c)}{2\sigma_c^2}\right] \qquad (8)$$

where the mean m_c(M) and width σ_c(M) are parameterized
functions of the halo mass M.
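
As a sanity check on the normalization, the log-normal form of Equation (8) can be integrated numerically. The sketch below assumes Φ_c(m|M) = exp[-log²(m/m_c) / (2σ_c²)] / (√(2π) ln10 m σ_c) and uses hypothetical values for m_c and σ_c; the helper names are not from the paper.

```python
import math


def phi_central(m, m_c, sigma_c):
    """Log-normal central CMF per unit stellar mass (assumed Equation 8 form)."""
    norm = math.sqrt(2.0 * math.pi) * math.log(10.0) * m * sigma_c
    return math.exp(-math.log10(m / m_c) ** 2 / (2.0 * sigma_c ** 2)) / norm


def integrate_log(f, m_lo, m_hi, n=4000):
    """Midpoint rule for the integral of f(m) dm on a grid uniform in log10(m)."""
    a, b = math.log10(m_lo), math.log10(m_hi)
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        m = 10.0 ** (a + (i + 0.5) * h)
        total += f(m) * m * math.log(10.0) * h   # dm = ln(10) m dlog10(m)
    return total


if __name__ == "__main__":
    m_c, sigma_c = 3.0e10, 0.2   # illustrative values only
    total = integrate_log(lambda m: phi_central(m, m_c, sigma_c), m_c / 100, m_c * 100)
    print(f"centrals per halo = {total:.6f}")
```

The factor 1/(ln10 m) converts a Gaussian in log10(m) into a density per unit m, so the integral over all stellar masses gives one central galaxy per halo.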

For the satellite population we adopt a Schechter function
with a steeper slope for the massive end. This is done
by squaring the argument of the exponential function in the
Schechter function:

$$\Phi_s(m|M) = \frac{\Phi_s^*}{m_s}\left(\frac{m}{m_s}\right)^{-\alpha_s}\exp\left[-\left(\frac{m}{m_s}\right)^2\right] \qquad (9)$$

Here too, the parameters Φ*_s(M), m_s(M) and α_s(M) are functions
of the host halo mass M. They are the normalization,
the characteristic mass and the low mass slope of the satellite
population of host halos of mass M.

5.2. Constraining the conditional mass function 

We populate the halos and subhalos in our simulation
with central and satellite galaxies according to the prescription
in Section 3. Then we choose halo mass bins between
log(M/M_⊙) = 10.2 and log(M/M_⊙) = 15.0 with a bin size of
ΔM = 0.4 dex. In every halo mass bin we seek all galaxies
which live in a host halo with a mass in that bin, which we
divide between central and satellite galaxies. For these populations
we then compute two separate SMFs which we normalize
such that the number of central galaxies per host halo
equals one. This procedure then yields for every halo mass
bin a central and a satellite distribution (dn_g/dlog m)/Δn_h.

FIG. 7. The conditional mass function (CMF) predicted by our model at z = 0. We plot the derived SMFs (dn_g/dlog m) in a subsample of halo mass bins.
The left panels show the CMF for a model without scatter while the right panels show the CMF with scatter of σ_m = 0.15. The label in each panel is the range
of host halo mass log(M/M_⊙). The stellar mass functions are normalized such that a host halo contains exactly one central galaxy. The total CMF consists of a
central galaxy part (crosses) and a satellite part (diamonds). The central part is described by a lognormal distribution (solid line) and the satellite part is described
by a truncated Schechter function (dashed line) using the parameters that were derived by a fit to the CMF. The dotted line shows the completeness limit used in
the fit to the satellite contribution.

Using Equation (6) we can now relate the stellar mass function
in a halo mass bin to the CMF:

$$\frac{dn_g(m)}{d\log m}\,\frac{1}{\Delta n_h} = \frac{\ln 10}{\Delta n_h}\; m\,\frac{dn_g(m)}{dm} = \ln 10\; m\,\frac{\tilde{\Phi}(m)}{\Delta n_h} \approx \ln 10\; m\,\Phi(m|M) \qquad (10)$$

Now we can fit the five parameters m_c(M), σ_c(M), m_s(M),
Φ*_s(M) and α_s(M) to the SMFs in each halo bin. We compute
and fit the central and the satellite parts separately.

The left panels of Figure 7 show the CMF in a subsample
of halo mass bins running from log(M/M_⊙) = 10.2 ±
0.2 to 15.0 ± 0.2, where we have not included intrinsic scatter
in the SHM relation. For the satellite part, only galaxies with
a mass above the completeness limits for each halo mass bin
(as indicated in Figure 7) have been used in the fit.

In low-mass halos (log(M/M_⊙) < 11.0) the contribution
from satellite galaxies is very small and the central contribution
dominates until log(M/M_⊙) = 12.0. For massive halos
(log(M/M_⊙) > 13.0) the satellite contribution dominates by
number. The mean of the lognormal fit to the central contribution
also increases with halo mass as stipulated by the
model derived in Section 3. The characteristic mass scale of
the satellite contribution also increases with halo mass, meaning
that the most massive satellite galaxies have a mass which
is comparable to the mass of the central galaxy.

The scatter of the central contribution σ_c(M) decreases with
halo mass. As we did not include any scatter in the model,
this scatter reflects the width (0.4 dex) of the halo mass bins
(σ_bin). The halo mass dependence of σ_c(M) arises because a
fixed halo mass bin is mapped to a smaller galaxy mass bin
for larger halo mass due to the shape of the SHM relation.
Another feature of the CMF is the slope for low mass satellite
galaxies α_s(M), which becomes shallower with increasing halo
mass.

5.3. The parameters of the conditional mass function 

In this section we investigate the halo mass dependence of
the five parameters of the CMF: m_c(M), σ_c(M), m_s(M), Φ*_s(M)
and α_s(M). They have been fixed by fitting to the stellar mass
functions in each halo mass bin. We introduce a parameterization
in order to describe the dependence on halo mass and
constrain it by a fit to each parameter. The results are presented
in Table 3. This provides a complete description of the
CMF.

As we have already determined the mean relation between
the stellar mass of a galaxy and the mass of its halo, the form
of m_c(M) has to be the same and can thus be described by
Equation (2):

$$m_c(M) = 2M\left(\frac{m_c}{M}\right)_0\left[\left(\frac{M}{M_{1c}}\right)^{-\beta_c} + \left(\frac{M}{M_{1c}}\right)^{\gamma_c}\right]^{-1} \qquad (11)$$

This yields four parameters (m_c/M)_0, M_1c, β_c and γ_c.

In the upper left panel of Figure 8, m_c(M) is plotted as a
function of halo mass. Note that by construction, it has the
same form as the SHM relation.

The scatter of the central galaxy contribution is high for
low halo masses and decreases for more massive halos. The
middle left panel of Figure 8 shows σ_c(M) as a function of
halo mass. As one can see, σ_c(M) goes to a constant value
both for low and high halo masses, and decreases with halo
mass in between. We therefore choose the following parameterization:

$$\sigma_c(M) = \sigma_\infty + \sigma_1\left[1 - \frac{2}{\pi}\arctan\left(\xi\,\log\frac{M}{M_2}\right)\right] \qquad (12)$$

This yields four more parameters σ_∞, σ_1, ξ and M_2. Here,
σ_∞ sets the high mass limit of σ_c(M) while σ_1 sets the difference
between the low and high mass limits of σ_c(M). The
parameter M_2 determines the mass scale at which the transition
occurs and ξ sets the strength. For a large (small) value of
ξ the transition occurs in a small (large) interval around M_2.

TABLE 3
Parameters of the CMF

Parameter     σ_m = 0.0             σ_m = 0.15
log M_1c      11.9347 ± 0.0257      11.9008 ± 0.0119
(m_c/M)_0      0.0267 ± 0.0006       0.0297 ± 0.0004
β_c            1.0059 ± 0.0332       1.0757 ± 0.0097
γ_c            0.5611 ± 0.0065       0.6310 ± 0.0121
log M_2       11.9652 ± 0.1118      11.8045 ± 0.0458
σ_∞            0.0569 ± 0.0052       0.1592 ± 0.0030
σ_1            0.1204 ± 0.0191       0.0460 ± 0.0029
ξ              6.3020 ± 3.0720       4.2503 ± 0.9945
log M_1s      12.1988 ± 0.0878      12.0640 ± 0.0931
(m_s/M)_0      0.0186 ± 0.0012       0.0198 ± 0.0015
β_s            0.7817 ± 0.0629       0.8097 ± 0.0971
γ_s            0.7334 ± 0.0452       0.6910 ± 0.0390
-log Φ_0      11.1622 ± 0.2874      10.8924 ± 0.4615
λ              0.8285 ± 0.0215       0.8032 ± 0.0367
log M_3       12.5730 ± 0.1351      12.3646 ± 0.0260
α_∞            1.3740 ± 0.0066       1.3676 ± 0.0043
α_1            0.0309 ± 0.0076       0.0524 ± 0.0051
ζ              4.3629 ± 2.6810       9.5727 ± 6.8240

NOTE. The second and third columns give the CMF
parameters and their errors for a model without scatter while
the fourth and the fifth columns give the CMF parameters and
their errors for a model with a scatter of σ_m = 0.15. All quoted
masses are in units of M_⊙.
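
The limiting behavior of this parameterization can be illustrated with a short sketch. It assumes that Equation (12) reads σ_c(M) = σ_∞ + σ_1 [1 - (2/π) arctan(ξ log(M/M_2))] and takes the no-scatter column of Table 3; the mapping of table rows to σ_∞, σ_1 and ξ follows the order in which the text introduces the parameters and should be treated as an assumption.

```python
import math

# No-scatter column of Table 3 (row assignment inferred from the text).
SIGMA_INF = 0.0569
SIGMA_1 = 0.1204
XI = 6.3020
LOG_M2 = 11.9652


def sigma_c(log_m, sigma_inf=SIGMA_INF, sigma_1=SIGMA_1, xi=XI, log_m2=LOG_M2):
    """Width of the central log-normal as a function of log10(M/Msun)."""
    step = 1.0 - (2.0 / math.pi) * math.atan(xi * (log_m - log_m2))
    return sigma_inf + sigma_1 * step


if __name__ == "__main__":
    for log_m in (10.5, 11.5, LOG_M2, 12.5, 14.0):
        print(f"log M = {log_m:7.4f}  sigma_c = {sigma_c(log_m):.4f}")
```

The bracket runs from 2 at very low mass to 0 at very high mass, so σ_c tends to σ_∞ + 2σ_1 and σ_∞ in the two limits and takes the value σ_∞ + σ_1 at M = M_2.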

The specific shape of σ_c(M) can be explained by the form
of the SHM relation (Equation 2). As we have not included
any scatter in this relation (σ_m = 0), the width of the lognormal
function of the central galaxy distribution arises from the
width of the halo mass bin (σ_c = σ_bin). A halo mass interval
[M_1, M_2] contains only central galaxies with stellar masses of
m ∈ [m_1(M_1), m_2(M_2)]. The lower left panel of Figure 8 illustrates
this by showing how halo mass bins affect the bin size
of the stellar mass. If we choose the same bin size for low and
high mass halos, we get different bin sizes for low and high
mass galaxies, due to the changing slope of m(M). Therefore
the transition occurs where the slope of m(M) changes, which
is around the characteristic mass M_1 of the SHM relation, so the
fitted transition mass M_2 is very close to that value.
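
The mapping of a fixed halo mass bin onto a stellar mass interval can be made concrete with a small sketch. It assumes the double power-law SHM relation (form as in Equation 11) with the "Our model" values of Table 4; the function names are illustrative.

```python
import math


def log_stellar_mass(log_m, log_m1=11.884, ratio0=0.0282, beta=1.06, gamma=0.556):
    """log10 m(M) for the assumed double power-law SHM relation."""
    x = 10.0 ** (log_m - log_m1)
    return log_m + math.log10(2.0 * ratio0 / (x ** (-beta) + x ** gamma))


def stellar_bin_width(log_m_centre, width=0.4):
    """Width in log10(m) spanned by a halo mass bin of `width` dex."""
    lo = log_stellar_mass(log_m_centre - width / 2.0)
    hi = log_stellar_mass(log_m_centre + width / 2.0)
    return hi - lo


if __name__ == "__main__":
    for centre in (10.6, 11.4, 12.0, 13.0, 14.6):
        print(f"log M = {centre:4.1f} +/- 0.2  ->  "
              f"stellar mass interval of {stellar_bin_width(centre):.3f} dex")
```

With these values a 0.4 dex halo mass bin spans roughly 0.8 dex in stellar mass well below M_1, where m scales as M^(1+β), and less than 0.2 dex well above it, where m scales as M^(1-γ). This is the origin of the mass dependence of the binning scatter.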

As Figure 7 shows that the satellite contribution falls off
around the mean mass of the central galaxy, we expect the
characteristic mass of the modified Schechter function m_s(M)
to follow m_c(M). We therefore describe m_s(M) with the same
function we used for the parameterization of m_c(M):

$$m_s(M) = 2M\left(\frac{m_s}{M}\right)_0\left[\left(\frac{M}{M_{1s}}\right)^{-\beta_s} + \left(\frac{M}{M_{1s}}\right)^{\gamma_s}\right]^{-1} \qquad (13)$$

This function yields four parameters (m_s/M)_0, M_1s, β_s and γ_s.

The upper right panel of Figure 8 plots m_s(M) as a function
of halo mass. We see that the shape is similar to that of
m_c(M). Note that m_s(M) is always lower than m_c(M), while
the deviation increases with increasing halo mass. This implies
that for high halo masses the satellite contribution to the
CMF falls off before the mean mass of the central galaxy.

FIG. 8. The five parameters of the conditional mass function as a function of halo mass. The crosses were derived from a fit to the CMF in every halo
mass bin (assuming no scatter in the stellar-to-halo mass relation). The solid line is a fit to the crosses using the respective parameterization. The CMF parameters
derived with a scatter of σ_m = 0.15 in the stellar-to-halo mass relation are given by the diamonds. The left panels show the central contribution: m_c(M) (top),
σ_c(M) (middle) and an illustration of the behavior of σ_c(M) (bottom). The right panels show the satellite contribution: m_s(M) (top), Φ*_s(M) (middle) and α_s(M)
(bottom). The dashed line in the top right panel indicating m_c(M) has been added for comparison.

The normalization of the modified Schechter function is
small for low halo masses and increases with the mass of the
host halo. The middle right panel of Figure 8 shows Φ*_s(M) as
a function of halo mass. We see that Φ*_s(M) can be described
by a power law and choose the following parameterization:

$$\Phi_s^*(M) = \Phi_0\left(\frac{M}{M_\odot}\right)^{\lambda} \qquad (14)$$

We get two more parameters, Φ_0 and λ. The normalization
of Φ*_s(M) is given by Φ_0 and the slope by λ. The shape of
Φ*_s(M) implies that the probability for a host halo to harbor
satellite galaxies (in a given stellar mass range) increases with
increasing halo mass.

The slope of the modified Schechter function for the satellite
contribution becomes shallower for more massive halos.
The lower right panel of Figure 8 shows α_s(M) as a function
of halo mass and shows that α_s(M) goes to a constant value for
both low and high halo masses. Similar to σ_c(M), we choose
the parameterization:

$$\alpha_s(M) = \alpha_\infty + \alpha_1\left[1 - \frac{2}{\pi}\arctan\left(\zeta\,\log\frac{M}{M_3}\right)\right] \qquad (15)$$

This yields four more parameters α_∞, α_1, ζ and M_3. Here,
α_∞ sets the high mass limit of α_s(M) while α_1 sets the difference
between the low and high mass limits of α_s(M). The
mass scale at which this transition occurs is determined by M_3
and ζ sets its strength. The transition occurs in a small (large)
interval around M_3 for a large (small) value of ζ.

5.4. The impact of scatter 

Until now, we have used the SHM relation (Equation 2) without any
intrinsic scatter. In this section we investigate how the CMF
and the parameters change if we include a scatter σ_m as described
in Section 4.5. This scatter is again assumed to be
constant with host halo mass.

The right panels of Figure 7 show the resulting CMF in
a subsample of halo mass bins for an intrinsic scatter of
σ_m = 0.15. The central part is now no longer near-constant
in the interval [m(M - ΔM/2), m(M + ΔM/2)] as in the left
panels of Figure 7 (where σ_m = 0.0) but has the form of a
lognormal with a broader distribution for larger σ_m. As the
scatter has been drawn from a lognormal distribution, the central
galaxy contribution to the CMF is distributed in the same
way. Hence, σ_c(M) changes with respect to the model that
does not include artificial scatter. We notice that at the massive
end the binning scatter σ_bin² and the intrinsic scatter σ_m²
add to the total scatter σ_tot². At the low mass end, however, the
total scatter is less than what has been obtained by using no
intrinsic scatter. This shows that the two forms of scatter do
not add in quadrature and indicates that they are correlated.
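
This statement can be checked directly against the fitted values in Table 3. The sketch below assumes that σ_c(M) tends to σ_∞ at high mass and to σ_∞ + 2σ_1 at low mass (the arctan form of Equation 12), and that the unlabeled table rows map to σ_∞ and σ_1 in the order given in the text.

```python
import math

SIGMA_M = 0.15  # intrinsic scatter in dex

# (sigma_inf, sigma_1) from Table 3; the row assignment is an assumption.
NO_SCATTER = (0.0569, 0.1204)
WITH_SCATTER = (0.1592, 0.0460)


def limits(sigma_inf, sigma_1):
    """High- and low-mass limits of sigma_c(M) for the arctan parameterization."""
    return sigma_inf, sigma_inf + 2.0 * sigma_1


def in_quadrature(a, b):
    """Combine two independent scatters."""
    return math.hypot(a, b)


if __name__ == "__main__":
    bin_hi, bin_lo = limits(*NO_SCATTER)
    fit_hi, fit_lo = limits(*WITH_SCATTER)
    print(f"high mass: quadrature {in_quadrature(bin_hi, SIGMA_M):.4f}, fitted {fit_hi:.4f}")
    print(f"low mass:  quadrature {in_quadrature(bin_lo, SIGMA_M):.4f}, fitted {fit_lo:.4f}")
```

Under these assumptions the quadrature sum at the massive end is about 0.160, close to the fitted 0.159, while at the low mass end quadrature would give about 0.333 against a fitted value of about 0.251, which is even below the binning-only value of about 0.298.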

We compare m_c(M), σ_c(M), m_s(M), Φ*_s(M) and α_s(M) for
σ_m = 0 and σ_m = 0.15 and show the resulting parameters in Table 3
(columns four and five) and in Figure 8. The mean mass
of the central galaxy m_c(M) does not change much if artificial
scatter is introduced. The most likely stellar mass of a central
galaxy is still given by the SHM relation, so the mean of the
Gaussian in logarithmic space stays the same. Also the parameters
of the satellite population [m_s(M), Φ*_s(M) and α_s(M)] do
not change significantly.

FIG. 9. Occupation numbers as a function of halo mass in stellar mass bins, derived using the conditional mass function. The left, middle and right panels
show the average number of central, satellite and total galaxies per halo, respectively.

5.5. The occupation numbers 

In order to compare our results to other HOD models it is
useful to compute the average number of galaxies per halo
⟨N⟩, as this is the main prediction of the HOD approach. To
compute ⟨N⟩(M) from the CMF we simply integrate Φ(m|M)
over the desired stellar mass range:

$$\langle N\rangle(M) = \int_{m_1}^{m_2}\Phi(m|M)\,dm \qquad (16)$$


As we have divided Φ(m|M) into a central galaxy contribution
Φ_c(m|M) and a satellite galaxy contribution Φ_s(m|M), we can
compute separate occupation numbers for central and satellite
galaxies:

$$\langle N\rangle(M) = \int_{m_1}^{m_2}\Phi_c(m|M)\,dm + \int_{m_1}^{m_2}\Phi_s(m|M)\,dm = \langle N_c\rangle(M) + \langle N_s\rangle(M)$$


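The average number of central galaxies per halo ⟨N_c⟩(M) is
given by

$$\langle N_c\rangle(M) = \frac{1}{2}\left[\mathrm{erf}(\eta_2) - \mathrm{erf}(\eta_1)\right] \qquad (17)$$

with the error function erf(x) and the integration boundaries

$$\eta_1 = \frac{\log(m_1/m_c)}{\sqrt{2}\,\sigma_c} \quad\text{and}\quad \eta_2 = \frac{\log(m_2/m_c)}{\sqrt{2}\,\sigma_c}.$$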
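
Equation (17) is straightforward to evaluate with a standard error function. The sketch below assumes η_i = log10(m_i/m_c) / (√2 σ_c); the values of m_c and σ_c in the example are hypothetical, and the function name is illustrative.

```python
import math


def n_central(m1, m2, m_c, sigma_c):
    """Mean number of centrals with m1 <= m < m2 (assumed Equation 17 form)."""
    eta1 = math.log10(m1 / m_c) / (math.sqrt(2.0) * sigma_c)
    eta2 = math.log10(m2 / m_c) / (math.sqrt(2.0) * sigma_c)
    return 0.5 * (math.erf(eta2) - math.erf(eta1))


if __name__ == "__main__":
    m_c, sigma_c = 10.0 ** 10.5, 0.2   # illustrative values only
    # Stellar mass bin of +/- one sigma_c around m_c.
    print(f"<N_c> in +/- 1 sigma bin: {n_central(10 ** 10.3, 10 ** 10.7, m_c, sigma_c):.4f}")
    # A bin far above m_c contains essentially no centrals of this halo mass.
    print(f"<N_c> in [10^11.5, 10^12]: {n_central(10 ** 11.5, 10 ** 12.0, m_c, sigma_c):.2e}")
```

A stellar mass bin that covers the whole log-normal returns one central per halo, and a bin of one σ_c on either side of m_c returns the familiar Gaussian fraction of about 68%.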

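The average number of satellite galaxies per halo ⟨N_s⟩(M) is

$$\langle N_s\rangle(M) = \frac{\Phi_s^*}{2}\left[\Gamma\left(-\frac{\alpha_s}{2}+\frac{1}{2},\,\kappa_1\right) - \Gamma\left(-\frac{\alpha_s}{2}+\frac{1}{2},\,\kappa_2\right)\right] \qquad (18)$$

with the upper incomplete gamma function Γ(a, x) and the integration
boundaries

κ_1 = (m_1/m_s)² and κ_2 = (m_2/m_s)².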
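
The satellite occupation number can also be obtained by direct numerical integration, which avoids the incomplete gamma function. The sketch below assumes the satellite CMF has the form Φ_s(m|M) = (Φ*_s/m_s) (m/m_s)^(-α_s) exp[-(m/m_s)²], with α_s defined as a positive slope; this sign convention is an assumption. The closed form used for comparison is the special case α_s = -1, for which Γ(1, x) = exp(-x). All parameter values are illustrative.

```python
import math


def phi_satellite(m, phi_star, m_s, alpha_s):
    """Modified Schechter function with a squared exponential cut-off."""
    x = m / m_s
    return phi_star / m_s * x ** (-alpha_s) * math.exp(-x * x)


def n_satellite(m1, m2, phi_star, m_s, alpha_s, n=20000):
    """Integrate phi_satellite over [m1, m2] with a log-spaced midpoint rule."""
    a, b = math.log(m1), math.log(m2)
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        m = math.exp(a + (i + 0.5) * h)
        total += phi_satellite(m, phi_star, m_s, alpha_s) * m * h  # dm = m dln(m)
    return total


def n_satellite_closed_form(m1, m2, phi_star, m_s):
    """Closed form for alpha_s = -1, where Gamma(1, x) = exp(-x)."""
    k1, k2 = (m1 / m_s) ** 2, (m2 / m_s) ** 2
    return 0.5 * phi_star * (math.exp(-k1) - math.exp(-k2))


if __name__ == "__main__":
    args = (1.0e9, 1.0e11, 2.0, 1.0e10)   # m1, m2, phi_star, m_s (hypothetical)
    print(f"numerical  : {n_satellite(*args, alpha_s=-1.0):.6f}")
    print(f"closed form: {n_satellite_closed_form(*args):.6f}")
```

For other values of α_s the same numerical routine applies unchanged, while the closed form requires an upper incomplete gamma function with first argument (1 - α_s)/2 from a special-function library.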



Figure 9 shows the resulting occupation numbers for the
values of the CMF parameters that were derived in Section
5.3 (using a scatter of σ_m = 0.15). The five lines in each panel
correspond to different stellar mass bins.

The left panel shows the average number of central galaxies
per halo ⟨N_c⟩(M) as a function of halo mass. In the middle
panel, the average number of satellite galaxies per halo
⟨N_s⟩(M) as a function of halo mass is shown. The right panel
plots the average number of all galaxies per halo ⟨N_tot⟩(M) as
a function of halo mass. A galaxy of a low stellar mass can
thus either be a central galaxy of a low mass halo, or a satellite
galaxy of a massive halo. It is not likely to live in a halo of
intermediate mass.

As it is common in the literature to plot occupation numbers
not for stellar mass intervals, but for galaxy samples with a
mass above a given threshold, we need to adjust Equations
(17) and (18). The stellar mass threshold is then given by m_1
while m_2 → ∞. This yields for the average number of central
galaxies

$$\langle N_c\rangle(M, m_1) = \frac{1}{2}\left[1 - \mathrm{erf}\left(\frac{\log(m_1/m_c)}{\sqrt{2}\,\sigma_c}\right)\right] \qquad (19)$$

since erf(x → ∞) → 1, and for the average number of satellite
galaxies

$$\langle N_s\rangle(M, m_1) = \frac{\Phi_s^*}{2}\,\Gamma\left[-\frac{\alpha_s}{2}+\frac{1}{2},\,\left(\frac{m_1}{m_s}\right)^2\right] \qquad (20)$$

since Γ(a, x → ∞) → 0.

Figure 10 shows occupation numbers for different stellar
mass thresholds. The left panel shows the average number of
central galaxies per halo ⟨N_c⟩(M) as a function of halo mass.
The middle panel plots the average number of satellite galaxies
per halo ⟨N_s⟩(M) as a function of halo mass. It is similar to
the middle panel of Figure 9, but it is larger at a given halo
mass. In the right panel the average number of all galaxies per
halo ⟨N_tot⟩(M) as a function of halo mass is shown.

6. COMPARISON 

6.1. Other HOD models 

Numerous variations on halo occupation models have been
presented in the literature. In this section we describe some
of the most popular ones and compare them to our model. As
many authors use different initial mass functions and definitions
of halo masses, we convert all results to the conventions
that we have used in this work (Kroupa IMF and virial overdensity).

FIG. 10. Occupation numbers as a function of halo mass for galaxy samples with a stellar mass above a given threshold. The left, middle and right panels
show the average number of central, satellite and total galaxies per halo, respectively.

TABLE 4
Comparison between different models

Model                 log M_1   (m/M)_0   β      γ
Our model             11.884    0.0282    1.06   0.556
Non-Parametric        11.766    0.0324    1.43   0.565
Wang et al. (2006)    11.845    0.0319    1.42   0.710
Somerville SAM        11.888    0.0276    0.98   0.629
Croton SAM            11.742    0.0405    0.92   0.610
Yang GC               12.067    0.0384    0.71   0.698

NOTE. All quoted masses are in units of M_⊙.

In the Non-Parametric model (Vale & Ostriker 2006;
Conroy et al. 2006; Shankar et al. 2006), galaxy properties,
such as luminosity and stellar mass, are monotonically related
to the mass of dark matter halos. Using the observed galaxy
SMF, the most massive halo is matched to the most massive
galaxy:

$$n_g(>m_i) = n_h(>M_i) \qquad (21)$$

In this way, the observed SMF is automatically reproduced.
Applying this procedure and fitting the parameters of the
SHM relation to the result, we have derived the values given in
Table 4. These are in good agreement with the parameters of
our model, except for β. We find that this is due to the shape
of the SHM ratio for low masses. For the Non-Parametric
model, m(M < M_1) cannot be perfectly described by a single
power law, as is assumed in our model.

Adding an additional parameter and assuming a fitting
function with five free parameters, we are able to fit the SHM
relation predicted by the non-parametric model quite precisely.
The fifth parameter accounts for the deviation from
the power law at high and low masses. Using the parameterization

$$m(M) = m_0\,\frac{(M/M_1)^{\gamma_1}}{\left[1 + (M/M_1)^{\beta}\right]^{(\gamma_1-\gamma_2)/\beta}} \qquad (22)$$

we determine the values given in Table 5. Figure 11 shows
the results of four- and five-parameter fits to the SHM relation
derived via the non-parametric method, compared with our
usual model. In the range where we applied the mass function
fit, the non-parametric model lies within our error bars.

TABLE 5
Fit parameters for Equation (22)

       log m_0   log M_1   γ_1    γ_2     β
       10.864    10.456    7.17   0.201   0.557
±       0.043     0.211    1.16   0.018   0.031

NOTE. All masses are in units of M_⊙.
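
The rank-matching condition of Equation (21), which underlies the non-parametric model discussed above, can be sketched in a few lines. This is a toy illustration that assumes halo and galaxy samples drawn from the same volume with equal counts; the catalogue values and the function name are hypothetical and not taken from the paper.

```python
def abundance_match(halo_masses, stellar_masses):
    """Pair halos and galaxies by rank so that n_g(>m) = n_h(>M).

    Both inputs are samples from the same volume and must have equal
    length. Returns (halo_mass, stellar_mass) pairs, most massive first.
    """
    if len(halo_masses) != len(stellar_masses):
        raise ValueError("need the same number of halos and galaxies")
    halos = sorted(halo_masses, reverse=True)
    stars = sorted(stellar_masses, reverse=True)
    return list(zip(halos, stars))


if __name__ == "__main__":
    # Toy catalogues (hypothetical numbers, in solar masses).
    halos = [3.0e11, 2.0e14, 8.0e12, 1.0e12, 5.0e13]
    stars = [4.0e10, 6.0e9, 3.0e11, 9.0e10, 2.0e10]
    for M, m in abundance_match(halos, stars):
        print(f"M = {M:.1e}  ->  m = {m:.1e}")
```

Because the pairing is by rank alone, the resulting m(M) is monotonic by construction and reproduces the input stellar mass distribution exactly; a parameterized SHM relation such as Equation (22) can then be fitted to the matched pairs.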



In Wang et al. (2006) a model similar to ours is used to constrain
the SHM ratio. The halo catalogue is taken from the
Millennium simulation (Springel et al. 2005); halos are identified
using a friends-of-friends group finder while substructure
is found using the SUBFIND algorithm of Springel et al.
(2001). As observational constraints, the authors use a SMF
which they compute from the SDSS DR2 data using the mass
estimates of Kauffmann et al. (2003) and the projected CFs of
Li et al. (2006).

The parameterization they use is similar to ours, with four
free parameters that can easily be converted to M_1, (m/M)_0,
β and γ, and an unconstrained scatter. These are fixed by generating
a grid of models, and the best-fit model is defined as
the one for which χ² = χ²(Φ) + χ²(w_p) is minimal. They find
that their fit improves if they take a different set of parameters
for central and satellite galaxies. In Table 4 we compare our
best-fit parameters with their central galaxy best-fit parameters,
which have been updated in Wang et al. (2007). We show
these results in Figure 11.

The values of M_1 and (m/M)_0 are in very good agreement
with our values, but the slopes are both higher, resulting in
fewer massive and fewer low mass galaxies. The reason for
the difference at the low mass end is the different simulation
used. As the resolution of the simulation in our model
is higher, the low mass end can be constrained more tightly.
For the massive end the difference in γ can be explained by
the additional unconstrained scatter that is used in Wang et al.
(2006). As the mass function is steep at high masses and shallow
for low masses, a change in the scatter will influence the
number of massive galaxies strongly, while it will have only a
small effect on the low mass end. As the other three parameters
M_1, (m/M)_0 and β are coupled to the Schechter function
parameters, there are two parameters (γ and the scatter) to constrain
the slope of the massive end of the SMF. This degeneracy can cause the
difference in γ between the two models. The fact that in the
Millennium simulation the cosmology is different from that of
our simulation also affects the value of the parameters.

FIG. 11. Comparison of the stellar-to-halo mass relation m(M) between
our model (solid line), models from other authors and galaxy-galaxy lensing
(symbols). The blue areas are the 1σ and 2σ levels and the error bars on
the symbols are the 1σ levels of the halo mass.

6.2. Gravitational lensing 

The relation between stellar mass and halo mass can be constrained
observationally using galaxy-galaxy lensing. Gravitational
lensing induces shear distortions of background objects
around foreground galaxies, allowing the mass of the
dark matter halo to be estimated. Mandelbaum et al. (2005,
2006) have used SDSS data to calibrate the predicted signal
from a halo model which has been derived from a dissipationless
simulation. They have extracted the mean halo mass as a
function of stellar mass. The lensing data for combined early
and late-type galaxies (Mandelbaum, private communication)
are shown in Figure 11 and are in excellent agreement with
our model.

6.3. Semi-analytic models 

As we discussed in the introduction, semi-analytic models
(SAMs) of galaxy formation attempt to predict the relationship
between dark halo mass and stellar mass by a priori modelling
of physical processes, such as the growth of structure,
cooling, star formation, and stellar and AGN feedback. We
compare our results with predictions from the latest version
of the semi-analytic models of Somerville & Primack (1999;
see Somerville et al. 2008). For this we compute the mean
stellar mass of central galaxies as a function of the mass of
the host halo in halo mass bins. The results are shown in
Figure 11 and are in good agreement with our model. This
is not surprising, as the physical parameters in the model
of Somerville et al. (2008) have been tuned to match the observed
stellar mass function at z = 0.

In Wang et al. (2006) the authors use the semi-analytic
model of Croton et al. (2006) and link galaxy properties, such
as the stellar mass, to the mass of the halo in which the
galaxy was last a central object, M_infall. They fit the same
four-parameter function that they used for their empirical model
(described above) to obtain the parameter estimates from the
SAM. We summarize these results in Table 4 and show them
in Figure 11.



The two slopes are in very good agreement with our results.
However, the normalization in the Croton et al. (2006) SAM
is ~ 25% higher and the characteristic mass is ~ 25% lower
than what we found and what Wang et al. (2006) find for their
model. This is because the SAM of Croton et al. (2006) does
not produce a perfect fit to the observed SMF.

6.4. SDSS group catalogue 

Another direct way of studying galaxy properties as a function
of halo mass is using the SDSS group catalogue presented
in Yang et al. (2007). In this approach, galaxies are first linked
together into "groups" using a friends-of-friends algorithm.
Each group is then assigned a total halo mass by matching
to the theoretical dark matter halo mass function. Yang et al.
(2008) present the relation between the mean stellar mass of
the central galaxy and the host halo mass. We fit the parameters
of equation 2 to their relation and present the results in
Table g] 
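
For readers who want to reproduce such a comparison numerically, the four-parameter relation can be evaluated directly. The snippet below is a minimal sketch, assuming that equation 2 has the double power-law form m/M = 2 (m/M)_0 [(M/M_1)^(-β) + (M/M_1)^γ]^(-1). The function names `shm_ratio` and `stellar_mass` are illustrative only, and the default values are the z = 0 entries of Table 6.

```python
import math


def shm_ratio(halo_mass, log_m1=11.88, norm=0.0282, beta=1.06, gamma=0.56):
    """Stellar-to-halo mass ratio m/M for a halo mass in solar masses.

    Assumes the double power-law form quoted in the lead-in; the defaults
    are the z = 0 parameter values of Table 6.
    """
    x = halo_mass / 10.0 ** log_m1
    return 2.0 * norm / (x ** (-beta) + x ** gamma)


def stellar_mass(halo_mass, **params):
    """Stellar mass m(M) = M * (m/M) in solar masses."""
    return halo_mass * shm_ratio(halo_mass, **params)


if __name__ == "__main__":
    for log_m in (11.0, 12.0, 13.0, 14.0):
        m = stellar_mass(10.0 ** log_m)
        print(f"log M = {log_m:4.1f}  ->  log m = {math.log10(m):5.2f}")
```

With this form the ratio equals the normalization (m/M)_0 exactly at M = M_1, and it falls off towards both lower and higher halo masses with slopes set by β and γ.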

We note that the characteristic mass and the normalization
derived from the group catalogue are both higher than our
model parameters. The high mass slope of the SHM relation
in the group catalogue is shallower than that of our model.
The low mass slope is also shallower; however, the constraints
on the low mass slope in the group catalogue are weak, since
the lowest halo masses are log(M/M_⊙) ~ 11.7. This can also
be seen in Figure 11, where we show the SHM relation of the
group catalogue for comparison.

7. HIGH REDSHIFT 

The discussion in the previous sections has focussed solely 
on the present day universe. In this section we extend our 
analysis to higher redshifts and derive the redshift dependence 
of the stellar-to-halo mass relation. Having chosen a particu- 
lar observed stellar mass function at a given redshift, we can 
investigate how the parameters of the SHM ratio change with 
time. This allows us to learn about the evolution of galaxies. 
Also, with this information, we can populate the N-body
simulation snapshots with galaxies at different redshifts using the
appropriate redshift dependent SHM relation, and then use the 
spatial information from the simulation to compute the stellar 
mass dependent correlation functions. 

Since at the present time there are no high redshift (z >
1) clustering data as a function of stellar mass available, we
fit the four parameters of equation 2 to the observed SMFs
at a given redshift. We argued in section 4.2 that, under the
assumption that central and satellite galaxies follow the same
SHM relation, the SMFs provide much stronger constraints
on the SHM ratio than the clustering data. Thus we should
be able to use our model to predict clustering as a function of
stellar mass at any redshift.

7.1. Which survey for which redshift 

In order to constrain the SHM relation we have to first se- 
lect observational stellar mass functions at the redshifts we 
want to investigate. Because of the trade-off between survey- 
ing large areas and obtaining deep samples, measurements of 
the SMF at high redshift tend to suffer from limited dynamic 
range. Therefore it is important to think about how the con- 
straints on our four SHM function parameters arise from the 
observations. 

The characteristic mass M_1 and the maximum SHM ratio
(m/M)_0 mostly depend on galaxies and halos of intermediate
mass. The high mass slope γ is fixed by the number of
massive galaxies since these live in the massive halos. On the
other hand, the low mass slope β is set by the number of low
mass galaxies since these live in the low mass halos.

FIG. 12. — Comparison between the model and the observed stellar mass functions for different redshifts. The observed stellar mass functions are taken
from Drory et al. (2004) (for z ≤ 0.9) and from Fontana et al. (2006) (for z ≥ 1.1) and are represented by the symbols. The model stellar mass functions have
been fitted to the observations and are represented by the solid lines. The dashed lines are the theoretical mass function we obtain from the redshift-dependent
parameterization. The redshift is indicated at the top of each panel.

For a survey with a fixed area on the sky, the observed volume
is smaller for low redshifts (z < 1) than for high redshifts.
In order to compute the SMF at high galaxy masses, the observed
volume has to be relatively large, as massive galaxies
are rare. Thus for low redshifts one has to choose a wide
survey (large area) to determine the SMF for massive galaxies
and properly constrain γ. Constraining the SMF at the
low mass end requires a high level of completeness for low
mass galaxies, which are very faint objects. Hence we have to
choose a deep survey that can detect faint galaxies in order to
constrain β.
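
The volume argument can be checked with a short calculation. The following is a rough sketch only: it assumes a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ω_m = 0.3 (illustrative values, not necessarily those adopted in this paper), and the helper names `comoving_distance` and `bin_volume` are hypothetical.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
H0 = 70.0             # assumed Hubble constant in km/s/Mpc (illustrative)
OMEGA_M = 0.3         # assumed matter density, flat LambdaCDM (illustrative)


def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance in Mpc via trapezoidal integration."""
    def inv_e(zp):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + 1.0 - OMEGA_M)

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        total += inv_e(i * dz)
    return C_KM_S / H0 * total * dz


def bin_volume(z_lo, z_hi, area_arcmin2):
    """Comoving volume in Mpc^3 of a redshift bin for a given survey area."""
    solid_angle = area_arcmin2 * (math.pi / (180.0 * 60.0)) ** 2  # steradian
    d_lo = comoving_distance(z_lo)
    d_hi = comoving_distance(z_hi)
    return solid_angle / 3.0 * (d_hi ** 3 - d_lo ** 3)


if __name__ == "__main__":
    area = 143.2  # arcmin^2, the GOODS-MUSIC area quoted in the text
    for z_lo, z_hi in ((0.4, 0.8), (2.0, 2.4)):
        vol = bin_volume(z_lo, z_hi, area)
        print(f"{z_lo:.1f} < z < {z_hi:.1f}: V = {vol:.3e} Mpc^3")
```

For bins of equal width in redshift, the high redshift bin encloses the larger comoving volume, which is why a small, deep field samples rare massive galaxies better at high redshift than at low redshift.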

Taking these considerations into account, we choose the
stellar mass functions presented in Drory et al. (2004) to constrain
the parameters M_1, (m/M)_0 and γ at low redshifts. The
authors derive the SMFs using MUNICS, which is a wide area,
medium-deep survey selected in the K band. The detection
limit is K ≈ 19.5 and the subsample the authors use covers
0.28 deg². We apply our method using these mass functions
and take the three parameters from that analysis.

However, the MUNICS survey is not deep enough to detect
galaxies that are fainter than the characteristic mass of
the SMF (the knee) and thus is not sufficient to constrain the
parameter β. To constrain β we choose the SMFs derived in
Fontana et al. (2006). This work is based on the GOODS-MUSIC
sample, a multicolor catalogue extracted from the
survey conducted over the Chandra Deep Field South. The
catalogue is selected in the z_850 and K bands, covers an area
of 143.2 arcmin², and is complete to a typical magnitude of
K ≈ 23.5. We apply our method using the SMFs computed
with the z_850 band selected sample and take the parameter β
from that analysis.

For high redshift (z > 1) we use the SMFs presented
in Fontana et al. (2006) to constrain all four parameters.
For high redshifts, the volume of a redshift bin becomes
large enough to sample massive galaxies, and therefore the
GOODS-MUSIC sample is sufficient to constrain γ.

We convert all SMFs which use a Salpeter initial mass function
to the Kroupa/Chabrier initial mass function.

FIG. 13. — Evolution of the stellar-to-halo mass relation parameters with redshift. The symbols correspond to the derived values while the solid line is a fit to
the data. For M_1, (m/M)_0 and γ this is a power-law, while for β it is a straight line.



TABLE 6

Stellar-to-halo mass ratio parameters for different redshifts

  z     log M_1    ±      (m/M)_0    ±        β      −      +      γ      ±
 0.0    11.88     0.02    0.0282    0.0005   1.06   0.05   0.05   0.56   0.00
 0.5    11.95     0.24    0.0254    0.0047   1.37   0.22   0.27   0.55   0.17
 0.7    11.93     0.23    0.0215    0.0048   1.18   0.23   0.28   0.48   0.16
 0.9    11.98     0.24    0.0142    0.0034   0.91   0.16   0.19   0.43   0.12
 1.1    12.05     0.18    0.0175    0.0060   1.66   0.26   0.31   0.52   0.40
 1.5    12.15     0.30    0.0110    0.0044   1.29   0.25   0.32   0.41   0.41
 1.8    12.28     0.27    0.0116    0.0051   1.53   0.33   0.41   0.41   0.41
 2.5    12.22     0.38    0.0130    0.0037   0.90   0.20   0.24   0.30   0.30
 3.5    12.21     0.19    0.0101    0.0020   0.82   0.72   1.16   0.46   0.21

NOTE. — For M_1, (m/M)_0 and γ the errors are drawn from a Gaussian
and thus are symmetric (indicated by the symbol ±). For β the errors
are drawn from a lognormal distribution and thus there is a lower error
(indicated by the symbol −) and an upper error (indicated by the symbol
+). All quoted masses are in units of M_⊙.



7.2. Evolution of the parameters 

Having selected the observational SMFs for a set of different
redshifts, we fit the four free parameters M_1, (m/M)_0, β
and γ to the observations. The errors on the parameters are
derived in a similar way as explained in section 3.3, but instead
of using confidence intervals we have fitted a Gaussian
to the probability distributions of M_1, (m/M)_0 and γ and a
lognormal to the probability distribution of β.
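
A lognormal naturally produces the asymmetric lower and upper errors quoted for β. The sketch below illustrates this under two assumptions that are ours rather than the text's: the tabulated β is taken to be the median of the lognormal, and the quoted errors are taken to correspond to ±1σ in ln β. The function names are hypothetical.

```python
import math


def lognormal_sigma(median, upper_error):
    """Width sigma in ln(beta) such that median * exp(sigma) = median + upper_error."""
    return math.log((median + upper_error) / median)


def asymmetric_errors(median, sigma):
    """Lower and upper 1-sigma errors of a lognormal with the given median."""
    lower = median * (1.0 - math.exp(-sigma))
    upper = median * (math.exp(sigma) - 1.0)
    return lower, upper


if __name__ == "__main__":
    # beta at z = 0.5 from Table 6: 1.37 (-0.22, +0.27)
    sigma = lognormal_sigma(1.37, 0.27)
    lower, upper = asymmetric_errors(1.37, sigma)
    print(f"sigma = {sigma:.3f}, lower = {lower:.3f}, upper = {upper:.3f}")
```

Under these assumptions the upper error of the z = 0.5 entry implies a lower error of about 0.23, close to the tabulated 0.22; a symmetric Gaussian could not reproduce this pattern.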

Figure 12 shows the observed and the model stellar mass
functions for different redshifts (indicated at the top of each
panel). The values of the resulting four parameters for the different
redshifts are shown in Table 6 and the redshift evolution
is plotted in Figure 13. The characteristic mass M_1 grows
with increasing redshift, while the normalization of the SHM
ratio (m/M)_0 becomes smaller with increasing redshift. This
means that there is less stellar content in a halo of a given
mass at a higher redshift.

The high mass slope γ can be constrained only weakly. This
is due to the limitation of the available galaxy surveys. As the
area of the survey is small, the volume in which galaxies are
detected is limited, and thus massive galaxies are very rare.
This results in large error bars for the SMF for massive galaxies,
which propagate into the error bars of γ. The situation
improves slightly for higher redshifts as the volume of higher
redshift bins is larger and thus more massive galaxies can be
observed. The value of γ decreases with increasing redshift.
For higher redshifts (z > 1) the error bars on γ become very
large because of the limited area covered by the available deep
surveys (in this case, GOODS). We leave it up to the reader
to assess the reliability of our results at z > 1 based on our
quoted error bars.

The low mass slope β seems to increase with redshift until
z ~ 2 and then drops to a low value. However, as the redshift
increases it becomes more and more difficult to observe low
mass galaxies, which are very faint. Thus the high redshift
values for β are not very well constrained and perhaps not
to be fully trusted. We therefore assume that β grows with
increasing redshift.

As we explained in Section 4.4, β is strongly related to the
parameter α of the Schechter function. A small value of β corresponds
to a large absolute value of α, while a large value of β
results in a low absolute value of α. This would mean that for
higher redshifts the stellar mass function would become shallower,
in contradiction with observations (e.g. Fontana et al.
2006 show that the absolute value of α increases with redshift).
However, one has to remember that the halo mass function
also changes with redshift and becomes steeper. Thus the
halo mass function steepens more than the SMF, so β has to
increase in order to compensate.

TABLE 7

Parameters for redshift dependent stellar-to-halo mass relation

      log M_1|z=0    μ      (m/M)_z=0    ν       γ_0     γ_1     β_0    β_1
        11.88       0.019    0.0282     −0.72    0.556   −0.26   1.06   0.17
  ±      0.01       0.002    0.0003      0.06    0.001    0.05   0.06   0.12

NOTE. — All quoted masses are in units of M_⊙.

With the derived parameter values it becomes possible to
interpolate and find the SHM ratio at any redshift. This is
done by choosing a redshift-parameterization for each of the
parameters.

As M_1 and (m/M)_0 do not change much above a redshift of
z ≈ 1.5, we choose power laws for the redshift dependence:

    \log M_1(z) = \log M_1|_{z=0} \cdot (z+1)^{\mu}            (23)

    (m/M)_0(z) = (m/M)_{z=0} \cdot (z+1)^{\nu}                 (24)

with the normalizations M_1|z=0 and (m/M)_z=0 and the slopes μ
and ν.

To parameterize γ over redshift, a linear dependence would
lead to a negative γ at a certain redshift. Though this is not
forbidden, it leads to a SHM ratio which would be increasing
monotonically with halo mass, which is inconsistent with
feedback processes at the massive end. Hence we also choose
a power-law parameterization for γ:

    \gamma(z) = \gamma_0 \cdot (z+1)^{\gamma_1}                (25)

with the normalization γ_0 and the slope γ_1.

From Figure 13 we are not able to infer whether β converges
to a constant value. Thus we adopt a simple linear
parameterization:

    \beta(z) = \beta_1 \cdot z + \beta_0                       (26)

Note that we have also tried other parameterizations (constant
β, decreasing β) but could not reproduce the observed stellar
mass functions. Using the linear parameterization for β and
the power laws for the other parameters we were able to compute
stellar mass functions that are in good agreement with
the observed ones.

A fit to the derived values presented in Table 6 yields the
parameters given in Table 7. As we do not fully trust the derived
values of β for z > 2, we neglect these two values and fit
a line to the remaining values of β.
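
The straight-line fit to β can be illustrated with the tabulated values. This is a simplified sketch: it uses an unweighted least-squares fit, whereas the weighting actually used for Table 7 is not stated here, so the numbers are only expected to agree within the quoted errors. The helper `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Unweighted least-squares straight line y = slope * x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_mean - slope * x_mean


# Redshifts and derived beta values from Table 6
Z = [0.0, 0.5, 0.7, 0.9, 1.1, 1.5, 1.8, 2.5, 3.5]
BETA = [1.06, 1.37, 1.18, 0.91, 1.66, 1.29, 1.53, 0.90, 0.82]

# Keep only the entries with z < 2, as described in the text
kept = [(z, b) for z, b in zip(Z, BETA) if z < 2.0]
beta_1, beta_0 = fit_line([z for z, _ in kept], [b for _, b in kept])

print(f"beta_1 = {beta_1:.2f}, beta_0 = {beta_0:.2f}")
```

The unweighted fit gives a slope of roughly 0.22 and an intercept of roughly 1.08, compatible with the Table 7 values β_1 = 0.17 ± 0.12 and β_0 = 1.06 ± 0.06.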

7.3. The stellar-to-halo mass relation for different redshifts

Having developed a redshift dependent model of the stellar- 
to-halo mass relation we now test this model by computing 
interpolated stellar mass functions for different redshifts. For 
this we use the method described in section 3. However, now
we do not use the parameters that have been derived at each 
redshift by fitting the model to the observations but we use the 
eight parameters of the redshift dependent SHM relation that 
have been derived in the previous section. 

The resulting interpolated SMFs are compared to the observations
(and the fitted mass functions) in Figure 12. For
z < 2 we see excellent overall agreement: the interpolated
mass functions mostly overlap with the error bars of the observations.

FIG. 14. — Stellar mass as a function of halo mass for different redshifts.
The solid lines show different redshifts, which are indicated at the top of the
panels.

The SMFs for the high redshifts z > 2 are too low. The
deviations are largest at the low mass end. However, if we
look at Figure 12 we see that the parameterized β is higher than
the derived value for the two highest redshifts, which results in
a low mass slope that is too shallow.

To compare the relation at different redshifts, we use the
redshift dependent SHM relation with the eight parameters
that have been derived in the previous section. Figure 14 plots
stellar mass versus halo mass for different redshifts. The plot
shows that at a fixed low halo mass (e.g. M = 10^11 M_⊙), galaxies
that live in such halos are more massive at low redshift
(m ~ 10^9 M_⊙ for z = 0) than galaxies that live in a halo of the
same mass at a higher redshift (m ~ 10^8 M_⊙ for z = 2). In
contrast, massive halos contain more massive galaxies at high
redshift, while at low redshifts the galaxies in massive halos
have less mass. However, as halos also become more massive
over time, one cannot identify a halo of a certain mass at
high redshifts with a halo of the same mass at low redshifts.
Thus the fact that at a given (high) halo mass the mass of
the central galaxy is lower at present than at an earlier epoch
does not imply that individual galaxies lose mass during their
evolution. This only means that large halos accrete dark matter
faster than large galaxies grow in stellar mass, while the
growth of low mass halos is slower than that of the central
galaxies they harbor (see also Conroy & Wechsler 2009). Because
of its statistical nature, our model is not suitable for
following the evolution of an individual galaxy through cosmic
time. We also note that the SHM relation at the massive
end (M > 10^13 M_⊙) undergoes very little evolution, which has
also been found by Brown et al. (2008).
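
The numbers quoted above can be checked by evaluating the redshift dependent relation directly. The sketch below assumes the double power-law form m/M = 2 (m/M)_0 [(M/M_1)^(-β) + (M/M_1)^γ]^(-1) for equation 2, power laws in (z+1) for M_1, (m/M)_0 and γ, and a linear β(z), with the parameter values of Table 7. The function names are illustrative, not taken from the paper.

```python
import math

# Table 7 values: normalizations at z = 0 and redshift slopes
LOG_M1_0, MU = 11.88, 0.019
NORM_0, NU = 0.0282, -0.72
GAMMA_0, GAMMA_1 = 0.556, -0.26
BETA_0, BETA_1 = 1.06, 0.17


def shm_parameters(z):
    """Return (log M_1, (m/M)_0, beta, gamma) at redshift z."""
    log_m1 = LOG_M1_0 * (1.0 + z) ** MU      # power law in log M_1
    norm = NORM_0 * (1.0 + z) ** NU          # power law in (m/M)_0
    gamma = GAMMA_0 * (1.0 + z) ** GAMMA_1   # power law in gamma
    beta = BETA_1 * z + BETA_0               # linear in beta
    return log_m1, norm, beta, gamma


def log_stellar_mass(log_halo_mass, z):
    """log10 of the stellar mass (solar masses) for a given log10 halo mass."""
    log_m1, norm, beta, gamma = shm_parameters(z)
    x = 10.0 ** (log_halo_mass - log_m1)
    ratio = 2.0 * norm / (x ** (-beta) + x ** gamma)
    return log_halo_mass + math.log10(ratio)


if __name__ == "__main__":
    for log_m in (11.0, 14.0):
        for z in (0.0, 2.0):
            print(f"log M = {log_m:4.1f}, z = {z:3.1f}: "
                  f"log m = {log_stellar_mass(log_m, z):5.2f}")
```

Under these assumptions a halo of 10^11 M_⊙ hosts a galaxy of roughly 10^8.8 M_⊙ at z = 0 and roughly 10^7.8 M_⊙ at z = 2, while at 10^14 M_⊙ the stellar mass is slightly higher at z = 2 than at z = 0, in line with the trends described in the text.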

7.4. Clustering at higher redshift 

Having determined the SHM relation as a function of redshift
we are now able to populate halos with galaxies at any
redshift. We choose a set of redshifts and populate the halos
with galaxies, deriving the stellar masses from the redshift
dependent SHM relation. We divide these galaxies into six
samples of different stellar mass between log m/M_⊙ = 8.5 and
11.5. For each of these samples we compute the real space CF
ξ(r) by counting pairs in distance bins (equation^). This leads
to six CFs for every selected redshift.

FIG. 15. — Correlation functions as a function of stellar mass at high redshift. The different panels correspond to different redshifts, which are given at
the bottom of each panel. The different lines are correlation functions for six stellar mass bins, which are given in the upper left panel. The error-bars on the
most massive sample are from Poisson statistics. The correlation function of dark matter particles (thick solid line) at the respective redshifts is also shown for
comparison. At high redshift the correlation function of the massive samples is only shown on large scales, since there is no relevant one-halo term.

FIG. 16. — Comparison between the model (lines) and observed (symbols)
projected correlation functions at 0.2 < z < 1.2. The upper and the left panels
show the zCOSMOS data in three redshift bins while the lower right panel
shows the VVDS data. The different lines and symbols in each panel are for
different stellar mass bins and thresholds as indicated in the panels.

Figure 15 shows the CFs for six different redshifts as a function
of stellar mass. We also plot the correlation function of
dark matter at the respective redshifts for comparison. For
all redshifts we see that massive galaxies are clustered more
strongly than low mass galaxies. The higher the redshift, the
more the CFs for different stellar masses differ. For high redshift,
there are very few massive galaxies in our limited volume
simulation box, and so the error bars become larger.
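
The pair counting behind these real-space CFs can be sketched as follows. The estimator used in the paper is not reproduced in this section, so the example assumes the simplest choice for a periodic simulation box: the natural estimator ξ = DD/RR − 1, with the random-pair counts RR computed analytically from the shell volumes. The function name and the brute-force loop are illustrative; a production code would use a tree or grid.

```python
import math
import random


def correlation_function(positions, box_size, edges):
    """Natural estimator xi = DD/RR - 1 in a periodic box.

    positions: list of (x, y, z) tuples; edges: increasing bin edges.
    Only valid for separations below half the box size.
    """
    n = len(positions)
    counts = [0] * (len(edges) - 1)
    for i in range(n):
        for j in range(i + 1, n):
            sep2 = 0.0
            for a, b in zip(positions[i], positions[j]):
                d = abs(a - b)
                d = min(d, box_size - d)  # periodic minimum-image distance
                sep2 += d * d
            r = math.sqrt(sep2)
            for k in range(len(counts)):
                if edges[k] <= r < edges[k + 1]:
                    counts[k] += 1
                    break
    n_pairs = n * (n - 1) / 2.0
    xi = []
    for k, dd in enumerate(counts):
        shell = 4.0 / 3.0 * math.pi * (edges[k + 1] ** 3 - edges[k] ** 3)
        rr = n_pairs * shell / box_size ** 3  # expected pairs, uniform field
        xi.append(dd / rr - 1.0)
    return xi


if __name__ == "__main__":
    rng = random.Random(42)
    box = 100.0
    points = [(rng.uniform(0, box), rng.uniform(0, box), rng.uniform(0, box))
              for _ in range(400)]
    print(correlation_function(points, box, [10.0, 20.0, 40.0]))
```

For an unclustered (uniform random) point set the estimator scatters around zero, which is a useful sanity check before applying it to galaxy samples selected by stellar mass.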

At low redshift (z < 1), observational measurements of stellar
mass dependent galaxy clustering have recently been published
using the VIMOS-VLT Deep Survey (VVDS) and the
zCOSMOS Survey (Meneux et al. 2008, 2009). In order to
compare our model predictions to these data, we compute correlation
functions for the same stellar mass bins and thresholds
and convert these to projected correlation functions as described
in section 3.2. Figure 16 plots the observed projected
correlation functions (symbols) and the model predictions
(lines) for different stellar mass bins or thresholds in three
redshift bins for the zCOSMOS Survey and one redshift bin
for the VVDS. There is good general agreement between the
model and observations. The zCOSMOS clustering amplitude
agrees very well with the model for r_p < 1 Mpc, but for
z < 0.8 deviates at larger distances and becomes higher than
the prediction. As suggested by Meneux et al. (2009), this
may be because the COSMOS field represents an overdense
volume at these redshifts. In contrast, the VVDS clustering
amplitudes are lower than those predicted by our model, leading
to the speculation that perhaps the VVDS represents an
underdense region.

FIG. 17. — The galaxy bias at a fixed scale (≈ 6 Mpc) as a function of redshift
for different stellar masses. The symbols have been derived by averaging
the bias over a distance interval while the lines are fits to the symbols.

7.5. The galaxy bias 

The bias of any object may be defined as the square root of
the ratio between the CF of the object ξ_oo(r) and the CF of dark
matter particles ξ_dm(r):

    b(r) = \sqrt{ \xi_{oo}(r) / \xi_{dm}(r) }                  (27)

Here we focus on the galaxy two-point CF ξ_gg(r,m,z), which
in addition to the distance between the galaxies also depends
on the redshift and the stellar mass of the galaxies:

    b(r,m,z) = \sqrt{ \xi_{gg}(r,m,z) / \xi_{dm}(r,z) }        (28)

From our predicted galaxy CFs, we compute the bias for every
redshift and stellar mass by averaging between r = 2 Mpc
and 10 Mpc, where b(r) is roughly a constant (as one can
see from Figure 15, the scale dependence of the bias is quite
weak). Figure 17 shows the redshift dependence of the bias.
The symbols represent the averaged value of the bias while
the solid lines correspond to a fit to the symbols. For this we
have used a power law form:

    b(z) = b_0 (z+1)^{b_1} + b_2                               (29)

where the parameters b_0, b_1, and b_2 are functions of stellar
mass. The fit parameters are given in Table 8.

This shows that the bias at a fixed stellar mass increases
with increasing redshift. Massive galaxies are biased more
strongly than galaxies of lower mass at any redshift. We
find that the bias of massive galaxies evolves more rapidly
than that of low mass ones (cf. White et al. 2007; Brown et al.
2008). Since the bias of massive halos evolves more rapidly
than that of low mass halos, this seems to be a feature of
any model in which the SHM relation is monotonically increasing
(i.e. the most massive galaxies reside in the most
massive halos).



TABLE 8
Galaxy bias fit parameters

  log m            b_0              b_1             b_2
  8.5 -  9.0   0.062 ± 0.017   2.59 ± 0.18   1.025 ± 0.062
  9.0 -  9.5   0.074 ± 0.008   2.58 ± 0.26   1.039 ± 0.028
  9.5 - 10.0   0.042 ± 0.003   3.17 ± 0.05   1.147 ± 0.021
 10.0 - 10.5   0.053 ± 0.014   3.07 ± 0.17   1.225 ± 0.077
 10.5 - 11.0   0.069 ± 0.014   3.19 ± 0.13   1.269 ± 0.087
 11.0 - 11.5   0.173 ± 0.035   2.89 ± 0.20   1.438 ± 0.061

NOTE. — All quoted masses are in units of M_⊙.
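
The bias fit can be evaluated for any redshift and stellar mass bin. The snippet below is a minimal sketch using the best-fit values of Table 8 and the form b(z) = b_0 (z+1)^{b_1} + b_2; the dictionary layout and the function name `galaxy_bias` are illustrative choices, and the quoted parameter errors are ignored.

```python
# (lower, upper) edge of the log stellar mass bin -> (b_0, b_1, b_2), Table 8
BIAS_FIT = {
    (8.5, 9.0): (0.062, 2.59, 1.025),
    (9.0, 9.5): (0.074, 2.58, 1.039),
    (9.5, 10.0): (0.042, 3.17, 1.147),
    (10.0, 10.5): (0.053, 3.07, 1.225),
    (10.5, 11.0): (0.069, 3.19, 1.269),
    (11.0, 11.5): (0.173, 2.89, 1.438),
}


def galaxy_bias(z, log_m):
    """Bias b(z) = b_0 * (z + 1)**b_1 + b_2 for the bin containing log_m."""
    for (lo, hi), (b0, b1, b2) in BIAS_FIT.items():
        if lo <= log_m < hi:
            return b0 * (1.0 + z) ** b1 + b2
    raise ValueError("stellar mass outside the tabulated range 8.5 - 11.5")


if __name__ == "__main__":
    for z in (0.0, 1.0, 2.0, 3.0):
        low = galaxy_bias(z, 8.7)
        high = galaxy_bias(z, 11.2)
        print(f"z = {z:3.1f}: b(low mass) = {low:5.2f}, b(high mass) = {high:5.2f}")
```

Evaluating the fit this way reproduces the two trends stated in the text: the most massive bin is more biased than the least massive one at every redshift, and its bias grows faster with redshift.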



8. CONCLUSIONS 

The goal of this paper is to characterize the relationship be- 
tween the stellar masses of galaxies and the masses of the dark 
matter halos in which they live at low and high redshift, and to 
make predictions of stellar mass dependent galaxy clustering 
at high redshift. 

We used a high-resolution N-body simulation and identified
halos and subhalos. Halos and subhalos were populated with
central and satellite galaxies using a parameterized SHM relation.
For host halos the mass was given by the virial mass
M_vir, while for subhalos we used the maximum mass of the
halo over its history M_max, since we expect the stellar mass of
the satellite galaxy to be more tightly linked to this quantity.

We described the ratio between stellar and halo mass by
a function with four free parameters: a low-mass slope β,
a characteristic mass M_1, a high-mass slope γ, and a normalization
(m/M)_0. We fit for the values of these parameters
by requiring that the observed galaxy SMF is reproduced.
We find that the SHM function has a characteristic
peak at M_1 ~ 10^12 M_⊙, and declines steeply towards
smaller mass halos (β ~ 1) and less steeply towards larger mass
halos (γ ~ 0.6). The physical interpretation of this behavior
is the interplay between the various feedback processes that
impact the star formation efficiency. Supernova feedback is
more effective at reheating and expelling gas in low mass halos,
while AGN feedback is more effective in high mass halos
(e.g. Shankar et al. 2006; Croton et al. 2006; Bower et al.
2006; Somerville et al. 2008). In this picture, the characteristic
mass M_1 is the halo mass where the efficiency of these two
processes crosses.

We have thoroughly discussed the meaning of the parameters.
We have also investigated the effects on the SHM relation
that arise from introducing scatter to the relation. To do
this we have added scatter drawn from a lognormal distribution
with a typical width of σ_m = 0.15 to the SHM function.
We showed that the impact of such a scatter on three of the
four parameters is negligible, with a small but significant impact
on the high-mass slope γ.

We showed that adding constraints from stellar mass dependent
galaxy clustering did not change the values of our best-fit
parameters. Put another way, the likelihood (here χ²) function
for the clustering constraint is much "flatter" than that for the
mass function constraint, so adding the clustering constraint does not
significantly change the distribution for the most likely (best-fit)
parameter values. Fitting to the SMF only, we found that
the observed projected CFs of galaxies for five samples of different
stellar mass were reproduced well. This means that the
clustering properties of galaxies are predominantly driven by
the clustering of the halos and subhalos in which they reside.
From this we concluded that our model can predict clustering
as a function of stellar mass at any redshift.

In order to describe how galaxies of different masses populate
host halos, we introduced the conditional mass function
Φ(m|M), which yields the average number of galaxies with
stellar masses in the range m ± dm/2 that live in a distinct
halo of mass M. It is described by five parameters which
are functions of halo mass. We divided the conditional mass 
function into a contribution from central galaxies (described 
by a lognormal distribution) and a contribution from satellite 
galaxies (described by a modified Schechter function). We 
computed the SMF in different halo mass bins and fitted the 
five parameters in each bin. Introducing halo mass dependent 
functions for every parameter and fitting these to the derived 
values of the parameters in the halo mass bins, we determined 
the halo mass dependence of the five parameters and thus fully 
described the conditional mass function. We also computed 
the occupation numbers of halos which give the average num- 
ber of galaxies of a given stellar mass that live inside a halo 
of mass M. 

We compared the results for our SHM function with those 
that have been derived using other approaches. These include 
other halo occupation type models, gravitational lensing and 
semi-analytic models. We showed that all methods yield con- 
sistent SHM relations. 

Using SMFs at higher redshifts, we applied our model at
earlier epochs of the universe. We thus constrained the SHM
relation at a given set of redshifts between z = 0 and z ~ 4.
This allowed us to study how the four parameters of the SHM
function depend on redshift. For each parameter we introduced
a redshift dependent function. We found that the characteristic
mass increases with redshift while the normalization
decreases with redshift. This indicates that there is less stellar
content in halos at higher redshifts. As the halo mass function
steepens more with redshift than the stellar mass function, the
low mass slope increases with redshift. We present an eight
parameter fitting function describing the redshift dependent
SHM relation.

Using the SHM relation that we derived in this way, along
with spatial information for halos from the N-body simulation,
we predicted the high-redshift real space CFs for six
stellar mass intervals. We find that for all redshifts, massive
galaxies are more clustered than galaxies of lower mass. Using
the real space CF of dark matter we calculated the galaxy
bias as a function of distance, redshift and stellar mass. Averaging
over spatial scale in an interval around r ≈ 6 Mpc, we
demonstrated that the galaxy bias increases with redshift, and
presented fitting formulae for the galaxy bias as a function of
stellar mass and redshift. In a companion paper (Moster et al.
2009) we will use these bias results to present predictions for
the cosmic variance σ_c for galaxies of different stellar mass.